Just a thought: why not accept everything as a valid token, except the characters that are known (or expected) to cause problems?
I would assume that characters like , or \ and ; can cause unwanted side effects. Having a limit on the hash also sounds useful.
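
To make the idea concrete, here is a rough sketch of what such a denylist check might look like. The names `DENIED_CHARS`, `MAX_TAG_LENGTH` and `is_valid_hashtag` are made up for illustration, and the limit of 64 is just a placeholder, not a proposal; I am also assuming the "limit" means a length limit.

```python
# Characters mentioned above as likely to cause problems: comma, backslash, semicolon.
DENIED_CHARS = frozenset(",\\;")

# Hypothetical length cap; the actual number would need to be agreed on.
MAX_TAG_LENGTH = 64


def is_valid_hashtag(tag: str) -> bool:
    """Accept any non-empty tag unless it is too long or contains a denied character."""
    if not tag or len(tag) > MAX_TAG_LENGTH:
        return False
    return not any(ch in DENIED_CHARS for ch in tag)
```

With this approach anything not explicitly denied passes, so non-ASCII tags work without extra rules, and the denylist can grow as new problem characters turn up.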
What should hashtags support?