On Mon, Jun 14, 2004 at 10:00:55AM +0100, James Gunn wrote:
Hi all,
I'm having some difficulty getting my regular expression working. Basically, I need to make sure that a UK postcode is valid. The postcode that is passed to my function sometimes has extra things with it such as:
Wakefield, WF1 3RD
Shrewsbury SY2 5PT Shropshire
Given that you've got messy data, have you considered matching everything that might be a valid postcode and then checking it against a symbol table? That would reduce the complexity of your regex a lot. You could also use Spirit, though its learning curve is a bit steep.
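A rough sketch of that two-step idea, in case it helps. Everything here is an assumption for illustration: find_postcode and kAreas are made-up names, the area list is only a small subset copied from the alternation in your regex, and I've used std::regex with a std::set standing in for the symbol table so the snippet is self-contained (it needs a C++11 compiler; boost::regex and boost::sregex_iterator have a very similar interface).

```cpp
#include <regex>
#include <set>
#include <string>

// Illustrative subset of area prefixes, copied from the alternation in the
// regex under discussion. A real table would list every area.
const std::set<std::string> kAreas = {"AB", "AL", "B", "G", "JE", "SY", "WF"};

// Hypothetical helper: returns the first postcode-shaped token in `text`
// whose area prefix is in `areas`, or an empty string if there is none.
std::string find_postcode(const std::string& text,
                          const std::set<std::string>& areas) {
    // Loose shape only: 1-2 letters, a digit, an optional digit or letter,
    // whitespace, a digit, two letters. The \b anchors stop a token such as
    // "JG2" from being picked up part-way through as "G2".
    static const std::regex shape(
        R"(\b([A-Z]{1,2})\d[\dA-Z]?\s+\d[A-Z]{2}\b)");

    for (std::sregex_iterator it(text.begin(), text.end(), shape), end;
         it != end; ++it) {
        // Capture group 1 is the area prefix; look it up in the table.
        if (areas.count((*it)[1].str()) != 0) {
            return it->str();
        }
    }
    return std::string();
}
```

With that subset, find_postcode("Wakefield, WF1 3RD", kAreas) should give back "WF1 3RD", while "JG2 7L5" should give an empty string, since the tail "7L5" is not a digit followed by two letters and "JG" is not in the table anyway.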
It now seems to be failing to find the postcode in the above examples. Also, when I pass my function a postcode that I know is invalid, such as JG2 7L5, it matches it as G2 7L5 instead of failing to match.
However, part of the problem with your regex is that J only matches as part of JE (it's between I[GMPV] and K[ATWY]). Hope this helps. Here is the changed regex:
--- cut ---
(?:(?:(^|\s)+ A[BL]|B[ABDHLNRST]?| C[ABFHMORTVW]|D[ADEGHLNTY]|E[CHNX]?|F[KY]|G[LUY]?| H[ADGPRSUX]|I[GMPV]|J[GE]|K[ATWY]|L[ADELNSU]?|M[EKL]?| N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTWY]?| T[0ADFNQRSW]|UB|W[ACDFNRSV]?|YO|ZE) \d(?:\d|[A-Z])?\s+\d[A-Z]{2}($|\s)+)
--- cut ---
-jbs
The regular expression I'm using is below:
(?:(?:(^|\s)+ A[BL]|B[ABDHLNRST]?| C[ABFHMORTVW]|D[ADEGHLNTY]|E[CHNX]?|F[KY]|G[LUY]?| H[ADGPRSUX]|I[GMPV]|JE|K[ATWY]|L[ADELNSU]?|M[EKL]?| N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTWY]?| T[0ADFNQRSW]|UB|W[ACDFNRSV]?|YO|ZE) \d(?:\d|[A-Z])?\s+\d[A-Z]{2}($|\s)+)
Can anyone tell me what's wrong with my expression? BTW I'm using Boost 1.31.0 on VC++ 7.1, Windows XP.