invalid byte sequence in UTF-8
We often come across this #FiveWordTechHorror, “invalid byte sequence in UTF-8”, while working on Ruby projects or projects built on a Ruby framework (e.g. Rails).
This error occurs when Ruby processes a string containing non-ASCII characters (anything beyond plain English letters) that was submitted through JavaScript or a Rails form and was not properly encoded, so its bytes are not valid UTF-8.
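A minimal sketch of how the error can surface, assuming the submitted value arrived as ISO-8859-1 bytes while Ruby treats the string as UTF-8. The sample value ("José") and the variable names are hypothetical.

```ruby
# Hypothetical input: "Jose" with an accented e, sent as ISO-8859-1 bytes.
# The byte 0xE9 on its own is not a valid UTF-8 sequence.
raw_name = "Jos\xE9".b.force_encoding(Encoding::UTF_8)

# valid_encoding? is a cheap way to detect the problem before it blows up.
name_is_valid = raw_name.valid_encoding?   # false

# Matching a regexp against a broken string is one place where the
# error is typically raised; we capture the message instead of crashing.
error_message =
  begin
    raw_name =~ /Jos/
    nil
  rescue ArgumentError => e
    e.message
  end
```

Checking valid_encoding? at the point where data enters the application usually makes the bad input easier to trace than waiting for a later string operation to fail.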
So, the question is: how do we properly encode and decode?
- When submitted through JavaScript:
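Use the encodeURI() or encodeURIComponent() method to encode. For a single parameter value, encodeURIComponent() is the safer choice, because encodeURI() leaves characters such as &, = and + unescaped. Ex: Suppose name contains Unicode characters.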
encoded_name = encodeURI(name)
On the Ruby side, decode it using the CGI::unescape method. Ex: Suppose encoded_name is received as params[:encoded_name].
name = CGI::unescape(params[:encoded_name])
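The decoding step can be tried outside Rails with a plain hash standing in for params. This is only a sketch: the hash and the sample value (what encodeURIComponent would produce for "José") are assumptions for illustration.

```ruby
require "cgi"

# Hypothetical stand-in for Rails' params. The value is the percent-encoded
# form of "Jose" with an accented e (UTF-8 bytes C3 A9).
params = { encoded_name: "Jos%C3%A9" }

name = CGI.unescape(params[:encoded_name])

name.encoding          # UTF-8
name.valid_encoding?   # true
name.bytes             # [74, 111, 115, 195, 169]
```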
- When submitted through a Rails form:
i) Use the CGI::escape method to encode it and the CGI::unescape method to decode it.
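A small round-trip sketch of option i), assuming a hypothetical form value ("José Müller") that contains non-ASCII characters and a space:

```ruby
require "cgi"

# Hypothetical form value, written with \u escapes to keep the source ASCII.
name = "Jos\u00E9 M\u00FCller"

encoded_name = CGI.escape(name)             # "Jos%C3%A9+M%C3%BCller"
decoded_name = CGI.unescape(encoded_name)

decoded_name == name   # true
```

Note that CGI.escape turns a space into "+", and CGI.unescape reverses that, so the two should be used as a pair.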
OR
ii) Use the unpack method of the String class and then pack the result using the pack method of the Array class.
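The directives to pass to unpack and pack are not given here. A common form of this trick, sketched below, is unpack("C*") followed by pack("U*"). It assumes the incoming bytes are ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point; for any other source encoding it would produce the wrong characters. The sample value is hypothetical.

```ruby
# Hypothetical input: the ISO-8859-1 bytes for "Jose" with an accented e.
# .b tags the string as binary so no encoding check gets in the way.
raw_name = "Jos\xE9".b

# "C*" reads every byte as an unsigned 8-bit integer.
code_points = raw_name.unpack("C*")   # [74, 111, 115, 233]

# "U*" writes each integer as a UTF-8 encoded code point.
name = code_points.pack("U*")

name.encoding         # UTF-8
name.valid_encoding?  # true
```

When the source encoding is known, raw_name.force_encoding("ISO-8859-1").encode("UTF-8") expresses the same conversion more explicitly.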