Home > HTML > How to escape special characters

How to escape special characters

February 26th, 2009

It took me a while to look through Erlang Documentation and found the utility function edoc_lib:escape_uri/1

1> edoc_lib:escape_uri("This sentence contains & % @ \n to be escaped").
"This%20sentence%20contains%20%26%20%25%20%40%20%a%20to%20be%20escaped"

However, the function is not fully compatible if you use the function to escape query string when making HTTP Post

Let’s try to escape some special characters

2> P1=edoc_lib:escape_uri("This sentence contains & % @").
"This%20sentence%20contains%20%26%20%25%20%40"
3> QueryString1="param1=" ++ P1.
"param1=This%20sentence%20contains%20%26%20%25%20%40"
4> http:request(post, {"http://erlangexamples.com/examples/post.php", [], 
4> "application/x-www-form-urlencoded", QueryString1}, [], []).
{ok,{{"HTTP/1.1",200,"OK"},
[{"connection","Keep-Alive"},
{"date","Fri, 27 Feb 2009 15:47:50 GMT"},
{"server",
"Apache/1.3.41 (Unix) Sun-ONE-ASP/4.0.2 Resin/3.0.25 mod_fastcgi/2.4.6 mod_log_bytes/1.2 mod_bwlimited/1.4 mod_auth_passthrough/1.8 FrontPage/5.0.2.2635 mod_ssl/2.8.31 OpenSSL/0.9.7a"},
{"content-length","59"},
{"content-type","text/html"},
{"x-powered-by","PHP/4.4.9"},
{"keep-alive","timeout=5, max=10"}],
"\nOnly accept POST:\nparam1 = 'This sentence contains & % @'\n"}}

It looks okie, now try some others

1> inets:start().
ok
2> P2=edoc_lib:escape_uri("newline\ncharacter").
"newline%acharacter"
3> QueryString2="p2=" ++ P2.
"p2=newline%acharacter"
4> http:request(post, {"http://erlangexamples.com/examples/post.php", [], 
4> "application/x-www-form-urlencoded", QueryString2}, [], []).
{ok,{{"HTTP/1.1",200,"OK"},
[{"connection","Keep-Alive"},
{"date","Fri, 27 Feb 2009 15:51:49 GMT"},
{"server",
"Apache/1.3.41 (Unix) Sun-ONE-ASP/4.0.2 Resin/3.0.25 mod_fastcgi/2.4.6 mod_log_bytes/1.2 mod_bwlimited/1.4 mod_auth_passthrough/1.8 FrontPage/5.0.2.2635 mod_ssl/2.8.31 OpenSSL/0.9.7a"},
{"content-length","85"},
{"content-type","text/html"},
{"x-powered-by","PHP/4.4.9"},
{"keep-alive","timeout=5, max=10"}],
"\nOnly accept POST:\np2 = 'newline?haracter'\n"}}

You can see that edoc_lib:escape_uri/1 failed to escape newline character. Although it translated \n -> %a, %0a must be the correct translation

5> QueryString3="p2=newline%0acharacter".
"p2=newline%0acharacter"
6> http:request(post, {"http://erlangexamples.com/examples/post.php", [], 
6> "application/x-www-form-urlencoded", QueryString3}, [], []).
{ok,{{"HTTP/1.1",200,"OK"},
[{"connection","Keep-Alive"},
{"date","Fri, 27 Feb 2009 15:54:55 GMT"},
{"server",
"Apache/1.3.41 (Unix) Sun-ONE-ASP/4.0.2 Resin/3.0.25 mod_fastcgi/2.4.6 mod_log_bytes/1.2 mod_bwlimited/1.4 mod_auth_passthrough/1.8 FrontPage/5.0.2.2635 mod_ssl/2.8.31 OpenSSL/0.9.7a"},
{"content-length","84"},
{"content-type","text/html"},
{"x-powered-by","PHP/4.4.9"},
{"keep-alive","timeout=5, max=10"}],
"\nOnly accept POST:\np2 = 'newline\ncharacter'\n"}}

I’ve made minor modification to edoc_lib:escape_uri/1 to make it works

escape_uri([C | Cs]) when C >= $a, C =< $z ->
    [C | escape_uri(Cs)];
escape_uri([C | Cs]) when C >= $A, C =< $Z ->
    [C | escape_uri(Cs)];
escape_uri([C | Cs]) when C >= $0, C =< $9 ->
    [C | escape_uri(Cs)];
escape_uri([C = $. | Cs]) ->
    [C | escape_uri(Cs)];
escape_uri([C = $- | Cs]) ->
    [C | escape_uri(Cs)];
escape_uri([C = $_ | Cs]) ->
    [C | escape_uri(Cs)];
escape_uri([C | Cs]) when C > 16#7f ->
    %% This assumes that characters are at most 16 bits wide.
    escape_byte(((C band 16#c0) bsr 6) + 16#c0)
    ++ escape_byte(C band 16#3f + 16#80)
    ++ escape_uri(Cs);
escape_uri([C | Cs]) -> escape_byte(C) ++ escape_uri(Cs);
escape_uri([]) -> [].

escape_byte(C) ->
    H = hex_octet(C),
    % io:fwrite("~p - ~p~n", [C, H]),
    normalize(H).

%% Append 0 if length == 1
normalize(H) when length(H) == 1 -> "%0" ++ H;
normalize(H) -> "%" ++ H.

HTML , , ,

  1. Francesco
    August 11th, 2009 at 04:33 | #1

    Hi, I’m trying to use this function. Actually it works in my local machine, either via erl shell either in a ejabberd module. But if I try to use this ejabberd module in production, when it gets to the point where it has to do edoc_lib:escape_uri the result is this :

    E(:ejabberd_hooks:190) : {undef,
    [{edoc_lib,escape_uri,
    ["asidj"]},
    {mod_test,save_message,1},
    {mod_test,on_message_sent,3},
    {ejabberd_hooks,run1,3},
    {ejabberd_c2s,session_established2,2},
    {gen_fsm,handle_msg,7},
    {proc_lib,wake_up,3}]}

    Seems like the function is undefined. You know these erlang errors are not so explicit… Do I need to include something in order to have it work?

    What I don’t understand is why the same function edoc_lib:escape_uri works in production if called inside an erlang shell…

  2. Francesco
    August 11th, 2009 at 04:58 | #2

    Ok I found out, edoc_lib.beam was not in the path… my fault!

  1. August 11th, 2009 at 22:31 | #1