ASPAlliance: Articles, reviews, and samples for .NET Developers
URL:
http://aspalliance.com/articleViewer.aspx?aId=808&pId=-1
CodeSnip: Using the IsMatch method in Regular Expressions to screen scrape a webpage
by Web Team at ORCS Web

I discovered this code tip while developing a web service to "screen scrape" a webpage and determine whether a certain text phrase was present. Regular expressions are well suited to this task; however, they are not the easiest to learn.

The Regex class in the System.Text.RegularExpressions namespace in .NET 2.0 has a handy method called IsMatch that achieves what I wanted. The code snippet below accepts two arguments (the URL to monitor and the text to search for), makes an HTTP request, and reads the response one line at a time through a StreamReader. Each line is tested against the text passed into the method. The one thing I discovered while using the IsMatch method is that the match is case sensitive and whitespace sensitive by default. For example, to find text in the title of the page at "http://www.iislogs.com", you would pass "IIS Logs -" exactly as it appears in the page, with the same capitalization and spacing. Because IsMatch treats the search text as a regular expression pattern, characters such as "." and "?" have special meaning unless they are escaped.
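The case sensitivity described above can be seen in a small console program. This is only a sketch: the helper name ContainsPhrase, the module name, and the sample title line are assumptions made for illustration and are not part of the original web service. It relies on Regex.Escape and RegexOptions.IgnoreCase, both of which are available in .NET 2.0.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IsMatchDemo
    ' Hypothetical helper (not from the article): treats phrase as literal text
    ' by escaping it, and optionally ignores case.
    Function ContainsPhrase(ByVal content As String, ByVal phrase As String, ByVal ignoreCase As Boolean) As Boolean
        Dim opts As RegexOptions = RegexOptions.None
        If ignoreCase Then
            opts = RegexOptions.IgnoreCase
        End If
        Return Regex.IsMatch(content, Regex.Escape(phrase), opts)
    End Function

    Sub Main()
        ' Assumed sample line; the real page markup may differ.
        Dim line As String = "<title>IIS Logs - Home</title>"

        ' Default matching is case sensitive.
        Console.WriteLine(Regex.IsMatch(line, "IIS Logs -"))   ' True
        Console.WriteLine(Regex.IsMatch(line, "iis logs -"))   ' False

        ' IgnoreCase relaxes the case requirement.
        Console.WriteLine(ContainsPhrase(line, "iis logs -", True))   ' True

        ' An unescaped "." matches any character; escaping makes it literal.
        Console.WriteLine(Regex.IsMatch("wwwXiislogsXcom", "www.iislogs.com"))   ' True
        Console.WriteLine(ContainsPhrase("wwwXiislogsXcom", "www.iislogs.com", False))   ' False
    End Sub
End Module
```

Escaping the phrase is a design choice: it suits callers who pass plain text, but it also means they can no longer pass a real pattern.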

Listing 1

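' These Imports belong at the top of the source file;
' the two functions go inside your web service class.
Imports System.Net
Imports System.Text.RegularExpressions

' Returns "Listed", "Not Listed", or "Error:<message>".
Public Function URLListed(ByVal URL As String, ByVal strArgument As String) As String
  Dim strListed As String
  strListed = readWebPage(URL, strArgument)
  Return strListed
End Function

Private Function readWebPage(ByVal strSource As String, ByVal strArgument As String) As String
  Dim strLine As String
  Dim objSR As System.IO.StreamReader = Nothing
  Dim objResponse As WebResponse = Nothing
  Dim objRequest As WebRequest = WebRequest.Create(strSource)

  Try
    objResponse = objRequest.GetResponse()
    objSR = New System.IO.StreamReader(objResponse.GetResponseStream(), System.Text.Encoding.ASCII)

    ' Test each line against the pattern and stop at the first match.
    Do While Not objSR.EndOfStream
      strLine = objSR.ReadLine()
      If Regex.IsMatch(strLine, strArgument) Then
        Return "Listed"
      End If
    Loop

    Return "Not Listed"

  Catch f As Exception
    Return "Error:" & f.Message
  Finally
    ' Runs on every exit path, including the early Return above,
    ' so the reader and response are always released.
    If objSR IsNot Nothing Then objSR.Close()
    If objResponse IsNot Nothing Then objResponse.Close()
  End Try
End Function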
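To try the line-by-line matching loop without making an HTTP request, one option is to run the same scan over an in-memory reader. The sketch below is not part of the original listing: the ScanReader helper, the module name, and the sample markup are hypothetical. StringReader stands in for the StreamReader over the response stream.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module ReaderScanDemo
    ' Hypothetical helper: the same line-by-line scan as readWebPage,
    ' but over any TextReader so it can be exercised offline.
    Function ScanReader(ByVal reader As TextReader, ByVal pattern As String) As String
        Dim line As String = reader.ReadLine()
        Do While line IsNot Nothing
            If Regex.IsMatch(line, pattern) Then
                Return "Listed"
            End If
            line = reader.ReadLine()
        Loop
        Return "Not Listed"
    End Function

    Sub Main()
        ' Assumed sample markup, split across three lines.
        Dim html As String = "<html>" & Environment.NewLine & _
                             "<title>IIS Logs - Home</title>" & Environment.NewLine & _
                             "</html>"

        Using reader As New StringReader(html)
            Console.WriteLine(ScanReader(reader, "IIS Logs -"))   ' Listed
        End Using

        Using reader As New StringReader(html)
            Console.WriteLine(ScanReader(reader, "iis logs -"))   ' Not Listed
        End Using
    End Sub
End Module
```

Because each line is tested on its own, a phrase that is split across a line break in the page source will not be found by this kind of scan.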

Conclusion

I hope this example helps in your Regular Expressions adventure. Happy coding!

 

Resources

Regular Expression Library

Regular Expression Advice


©Copyright 1998-2024 ASPAlliance.com