The rule that turns spaces in a URI into '+'

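A search URL written as foo+bar is far easier to read than one written as foo%20bar. From a specification standpoint, though, converting everything uniformly to %XX would be more consistent.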

So I looked into what the specifications actually say.

As quoted below, the W3C Recommendation for HTML forms and the CGI specification both seem to state this explicitly.

This is the default content type. Forms submitted with this content type must be encoded as follows:
Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.
Forms in HTML documents

Form data is a stream of name=value pairs separated by the & character. Each name=value pair is URL encoded, i.e. spaces are changed into plusses and some characters are encoded into hexadecimal.
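To see the difference between the two styles concretely, here is a small sketch using Python's standard urllib.parse module. The sample values ("foo bar", the keys "q" and "lang") are my own and not taken from either specification.

```python
from urllib.parse import quote, quote_plus, urlencode

value = "foo bar"

# Form-style encoding: the space becomes '+'.
form_style = quote_plus(value)

# Uniform percent-encoding: the space becomes '%20'.
percent_style = quote(value)

# urlencode joins name=value pairs with '&' and uses quote_plus by default.
query = urlencode({"q": "foo bar", "lang": "ja"})

print(form_style)     # foo+bar
print(percent_style)  # foo%20bar
print(query)          # q=foo+bar&lang=ja
```

The form-style result is what a browser typically sends for a submitted search form, which is why search URLs tend to look like foo+bar.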

I could not find anything about this in the current URI RFC, but an early RFC does state it.

Within the query string, the plus sign is reserved as shorthand
notation for a space. Therefore, real plus signs must be encoded.
This method was used to make query URIs easier to pass in systems
which did not allow spaces.
RFC 1630 - Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web
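The consequence of "real plus signs must be encoded" can be checked with a short sketch, again with Python's urllib.parse. The string "1 + 1" is just an example I picked; the point is that a literal plus has to become %2B, and that a decoder unaware of the '+' convention returns something different.

```python
from urllib.parse import quote_plus, unquote, unquote_plus, parse_qs

original = "1 + 1"

# Spaces become '+', so the literal plus sign must be escaped as %2B.
encoded = quote_plus(original)

# A decoder that knows the '+' convention recovers the original text.
form_decoded = unquote_plus(encoded)

# A plain percent-decoder leaves '+' alone, so the spaces are lost.
plain_decoded = unquote(encoded)

# parse_qs treats the input as form data, so '+' is read as a space.
parsed = parse_qs("q=" + encoded)

print(encoded)        # 1+%2B+1
print(form_decoded)   # 1 + 1
print(plain_decoded)  # 1+++1
print(parsed)         # {'q': ['1 + 1']}
```

So whether '+' means a space depends on which decoder handles the query string, which is probably why this convention causes confusion.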

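That said, the current RFC also notes that, in the query component, it is sometimes better not to percent-encode certain characters (in context, "/" and "?").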

However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.
RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax
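As a rough illustration of that passage, the sketch below encodes a URI that is itself carried as a query value, once with everything escaped and once leaving "/" and "?" as they are. The URI "http://example.com/a?b" and the variable names are hypothetical, and using the safe parameter of urllib.parse.quote this way is only my reading of the RFC, not something it prescribes.

```python
from urllib.parse import quote

# Hypothetical URI carried as the value of a query parameter.
target = "http://example.com/a?b"

# Escape every reserved character.
strict = quote(target, safe="")

# Leave '/' and '?' unescaped, as the RFC suggests can help usability.
readable = quote(target, safe="/?")

print(strict)    # http%3A%2F%2Fexample.com%2Fa%3Fb
print(readable)  # http%3A//example.com/a?b
```

The second form is easier to read, though the ':' is still escaped here because it was not listed in safe.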
