Searching Excel docs for phone numbers

Hi all,

I was wondering if anyone knows a way to run a keyword search for phone numbers that still catches all of the different ways a number can be written.

E.g.
I’m looking for +33109758351 but I can’t find the number if it is saved like:
+33 1 09 75 83 51
0109758351
0109 758351
0033109758351
and so on.

I know I could enter all the different variations of the number e.g. “0109758351 OR 0033109758351” but if I have many numbers to search this would take a while.

I understand there will be more false hits the more variations I put in.

Any help or comments would be really appreciated.

Thanks in advance!

Based on your examples, you could try the regular expression '\+?[0-9 ]{10,16}'

% echo '+33 1 09 75 83 51
0109758351
0109 758351
0033109758351' | grep -E '\+?[0-9 ]{10,16}'
+33 1 09 75 83 51
0109758351
0109 758351
0033109758351

Now, to be sure, that is going to lead to a lot of false hits, because it will match any run of digits and spaces at least 10 characters long, and you will find plenty of those.

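You could throw in word boundaries, '\b\+?[0-9 ]{10,16}\b', but I don’t think this would appreciably reduce your false hits. One caveat: at the start of a line or after a space, \b cannot match in front of a +, so the match begins at the first digit and the + is left out. grep still prints the line.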

% echo '+33 1 09 75 83 51
0109758351
0109 758351
0033109758351' | grep -E '\b\+?[0-9 ]{10,16}\b'
+33 1 09 75 83 51
0109758351
0109 758351
0033109758351

The only real way to limit false hits, given the variety of ways an international phone number can be written, is to use a regular expression that matches only the formats you want to match:

'\b\+?[0-9]{2} [0-9]( [0-9]{2}){4}\b|\b[0-9]{10}\b|\b[0-9]{4} [0-9]{6}\b|\b[0-9]{13}\b'

% echo '+33 1 09 75 83 51
0109758351
0109 758351
0033109758351' | grep -E '\b\+?[0-9]{2} [0-9]( [0-9]{2}){4}\b|\b[0-9]{10}\b|\b[0-9]{4} [0-9]{6}\b|\b[0-9]{13}\b'
+33 1 09 75 83 51
0109758351
0109 758351
0033109758351
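
If you have many numbers to search, one option is to generate a pattern per number instead of writing each one by hand. The following is only a sketch: build_pattern is a made-up helper name, and it assumes "00" as the international prefix, "0" as the trunk prefix, GNU grep for \b, and a search tool that accepts extended regular expressions. It allows an optional space between any two digits.

```shell
#!/bin/sh
# build_pattern CC NATIONAL (hypothetical helper, not part of any tool)
#   CC       country code without the "+", e.g. 33
#   NATIONAL national number without its leading 0, e.g. 109758351
# Assumes "00" as the international prefix and "0" as the trunk prefix.
build_pattern() {
    # Allow an optional space after each digit, then drop the trailing " ?".
    digits=$(printf '%s\n' "$2" | sed 's/[0-9]/& ?/g; s/ ?$//')
    # Accept +CC, 00CC or a single leading 0 in front of the national digits.
    printf '(\\+%s|00%s|0) ?%s\\b\n' "$1" "$1" "$digits"
}

pattern=$(build_pattern 33 109758351)
echo "$pattern"

# Check the generated pattern against the four variants from the question.
printf '%s\n' '+33 1 09 75 83 51' '0109758351' '0109 758351' '0033109758351' |
    grep -E -- "$pattern"
```

For the example number this prints (\+33|0033|0) ?1 ?0 ?9 ?7 ?5 ?8 ?3 ?5 ?1\b followed by all four variants. The optional spaces make it more permissive than a fixed-format pattern, so it will also accept spacings that nobody actually writes.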

I realize that all the expressions above match the four examples. To compare them, throw in some number groupings that are not phone numbers and see how well each one filters out false hits.
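
One way to run that comparison, as a rough sketch: the three decoy lines below are invented for the test, count_hits is a made-up helper, and the results assume GNU grep's handling of \b.

```shell
#!/bin/sh
# The four real variants plus three decoys (decoys are made up for this test).
sample='+33 1 09 75 83 51
0109758351
0109 758351
0033109758351
Invoice 2023 04 17 0001
1234567890123456789
Order ref 12 34 56'

# The three expressions from this thread.
loose='\+?[0-9 ]{10,16}'
bounded='\b\+?[0-9 ]{10,16}\b'
strict='\b\+?[0-9]{2} [0-9]( [0-9]{2}){4}\b|\b[0-9]{10}\b|\b[0-9]{4} [0-9]{6}\b|\b[0-9]{13}\b'

# count_hits PATTERN: print how many sample lines the pattern matches.
count_hits() {
    printf '%s\n' "$sample" | grep -cE -- "$1"
}

echo "loose:   $(count_hits "$loose")"
echo "bounded: $(count_hits "$bounded")"
echo "strict:  $(count_hits "$strict")"
```

This should print 6, 5 and 4 matching lines. The loose pattern also hits the invoice line and the 19-digit string, the word boundaries only get rid of the 19-digit string, and the strict pattern is left with just the four phone numbers.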