newbie question on tokenizer and unicode text

21 Sep 2002

Hi all,

I'm a newcomer to Boost and am rewriting one of my text 
characterization programs to use some Boost libraries. I'm running 
into a problem with non-ASCII text.

I'm using MSVC 7.0 to compile my program, and the project is set up to 
link to the Unicode CRT.

The problem occurs when boost::char_delimiters_separator<>::is_nonret 
calls MSVC's CRT std::isspace(), which takes an int; isspace() then 
(of course) chokes on a '©' (for example) in my training data. 
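
To illustrate, here is a minimal sketch of the failure mode I suspect. It assumes plain char is signed (the MSVC default) and that the '©' arrives as the single byte 0xA9, as in Latin-1. The C standard only defines std::isspace() for values representable as unsigned char, or EOF, so a negative char promoted to int falls outside that range. The helper names below are hypothetical and are not part of Boost.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper (not part of Boost): convert a char to the int value
// that the <cctype> functions expect, i.e. in the range 0..UCHAR_MAX.
inline int to_ctype_arg(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical whitespace test that stays in range for bytes >= 0x80.
inline bool is_space_safe(char c) {
    return std::isspace(to_ctype_arg(c)) != 0;
}

// Small self-check, assuming the default "C" locale is in effect.
inline void check_examples() {
    const char copyright_byte = '\xA9';            // '©' in Latin-1 (assumed encoding)
    assert(to_ctype_arg(copyright_byte) == 0xA9);  // never negative after the cast
    assert(is_space_safe(' '));                    // ordinary space is whitespace
    assert(!is_space_safe(copyright_byte));        // '©' is not whitespace in "C"
}
```

This only shows the cast I would expect to be needed before the isspace() call. Whether the tokenizer can be made to apply it, or should be instantiated on wchar_t instead, is exactly what I'm unsure about.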

Is there something obvious I'm missing?

Thanks in advance.

-Eric Lucas

elqed1