Mikey

Request Help with Regex

Aug 27 2004 2:50 AM
I have a text file which, among other things, contains the following SINGLE line:
Internal Links
Intranet Pages
Email
Hotmail Email

I want the following as the result: http://www.hotmail.com,Hotmail Email

Please help me; your help is greatly appreciated.

MJCM
Company

I am using regex with the following expressions:

1) r = New Regex("(?:""kop\x22\x3E)(\w[A-Z]\S+\x20\S+)(?:\b\x3c)", RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace Or RegexOptions.Compiled)
This gives as result: Internal Links

2) r = New Regex("(?:""kop\x22\x3E)(\w[A-Z]\S+|\x20|\S+)(?:\b\x3c)", RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace Or RegexOptions.Compiled)
This gives as result: Company

3) r = New Regex("href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))", RegexOptions.IgnoreCase Or RegexOptions.Compiled)
This gives the following result:
http://intranet.company.com/
http://intranet.company.com/email/
http://www.hotmail.com/

I want as output a text file with one of the following:

1)
Internal Links
url1=http://intranet.company.uk/,Intranet Pages
url2=http://intranet.company.de/email/,Email
url3=http://www.hotmail.com/,Hotmail Email
Company
url1=http://www.test.com/,Test Url
url2=http://sonelink.url.nl/,Somelink Url

2)
Internal Links,http://intranet.company.uk/,Intranet Pages
Internal Links,http://intranet.company.de/email/,Email
Internal Links,http://www.hotmail.com/,Hotmail Email
Company,url1=http://www.test.com/,Test Url
Company,http://sonelink.url.nl,Somelink Url

My questions:

1) How can I combine regex expressions Nr. 1 and Nr. 2 into one single expression? (These expressions don't work flawlessly.)

2) How can I group the URLs (found with regex expression 3) so that they form a single category per group? For example, under "Internal Links" only the following URLs must appear:
http://intranet.company.com/
http://intranet.company.com/email/
http://www.hotmail.com/
but the URLs under "Company" must appear under their own group, so "Company" will contain only the following URLs:
http://www.test.com/
http://sonelink.url.nl/

3) How can I extract the URL and the comment from the following line?
Hotmail Email
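
One possible approach to all three questions, as a rough sketch only: use a single expression with two alternatives, one for a heading and one for a link, and walk the matches in document order while remembering the most recent heading. The markup is not shown in the post, so the sample input below is an assumption inferred from the expressions above (headings as the text right after class="kop">, links as ordinary <a href="...">text</a> tags). The names LinkGrouper, GroupLinks, and the group names heading, url, and text are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkGrouper

    ' Alternative 1: the text following class="kop"> (assumed heading markup).
    ' Alternative 2: an <a href="...">text</a> link (assumed link markup).
    ' [^<]+? accepts one-word and multi-word headings alike.
    Private ReadOnly Pattern As New Regex( _
        "class\s*=\s*""kop""[^>]*>\s*(?<heading>[^<]+?)\s*(?=<)" & _
        "|<a\b[^>]*?\bhref\s*=\s*""(?<url>[^""]*)""[^>]*>\s*(?<text>[^<]*?)\s*</a>", _
        RegexOptions.IgnoreCase Or RegexOptions.Compiled)

    ' Returns one "heading,url,text" entry per link, using the most recent heading.
    Function GroupLinks(ByVal html As String) As List(Of String)
        Dim entries As New List(Of String)()
        Dim currentHeading As String = ""

        For Each m As Match In Pattern.Matches(html)
            If m.Groups("heading").Success Then
                ' A heading starts a new category for the links that follow it.
                currentHeading = m.Groups("heading").Value
            ElseIf m.Groups("url").Success Then
                entries.Add(currentHeading & "," & _
                            m.Groups("url").Value & "," & _
                            m.Groups("text").Value)
            End If
        Next

        Return entries
    End Function

    Sub Main()
        ' Hypothetical single-line input; the real markup is not shown in the post.
        Dim html As String = _
            "<td class=""kop"">Internal Links</td>" & _
            "<a href=""http://intranet.company.com/"">Intranet Pages</a>" & _
            "<a href=""http://intranet.company.com/email/"">Email</a>" & _
            "<a href=""http://www.hotmail.com/"">Hotmail Email</a>" & _
            "<td class=""kop"">Company</td>" & _
            "<a href=""http://www.test.com/"">Test Url</a>" & _
            "<a href=""http://sonelink.url.nl/"">Somelink Url</a>"

        For Each entry As String In GroupLinks(html)
            Console.WriteLine(entry)
        Next
    End Sub

End Module
```

With the assumed input, this should print lines in the second output format, such as Internal Links,http://www.hotmail.com/,Hotmail Email. The link alternative only handles double-quoted href values and link text without nested tags; unquoted attributes or markup inside the link would need a looser pattern or a real HTML parser.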