CodeSnip: Using the IsMatch method in Regular Expressions to screen scrape a webpage
Published: 21 Apr 2006
Unedited - Community Contributed
In this article, Steve demonstrates how to use the IsMatch method in Regular Expressions to screen scrape a webpage.
by Web Team at ORCS Web

I discovered this code tip while developing a web service to "screen scrape" a webpage and determine whether a certain text phrase was present. Regular expressions are well suited to this task; however, they are not the easiest thing to learn.

The Regex class in the System.Text.RegularExpressions namespace in .NET 2.0 has a handy method called IsMatch that does what I wanted. The code snippet below accepts two arguments (the URL to monitor and the text to search for), makes an HTTP request, and reads the webpage through a stream reader. Each line read from the stream is searched for the text passed into the method. The one thing I discovered while using the IsMatch method is that the search is case- and whitespace-sensitive. For example, if you are searching "" for text in the title of the page, searching for "IIS Logs -" looks for exactly that phrase, with the same capitalization and spacing. Because the text is used as a regular expression pattern, characters such as "." or "(" also carry special meaning unless they are escaped.
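The case and whitespace sensitivity described above can be seen in a small console sketch. The sample title string and the "Price (USD)" text are made up for illustration. RegexOptions.IgnoreCase and Regex.Escape are standard members of System.Text.RegularExpressions; they are shown here as one possible way to relax the case rule and to treat the search text literally, not as something the original listing does.

```vb
Imports System
Imports System.Text.RegularExpressions

Module IsMatchDemo
    Sub Main()
        ' Hypothetical page title, used only for illustration.
        Dim title As String = "<title>IIS Logs - Example Site</title>"

        ' Exact phrase, same case and spacing: prints True.
        Console.WriteLine(Regex.IsMatch(title, "IIS Logs -"))

        ' Different case: prints False, because matching is case-sensitive by default.
        Console.WriteLine(Regex.IsMatch(title, "iis logs -"))

        ' Two spaces instead of one: prints False, because spaces are matched literally.
        Console.WriteLine(Regex.IsMatch(title, "IIS  Logs -"))

        ' RegexOptions.IgnoreCase relaxes the case rule: prints True.
        Console.WriteLine(Regex.IsMatch(title, "iis logs -", RegexOptions.IgnoreCase))

        ' Regex.Escape makes metacharacters such as "(" literal: prints True.
        Console.WriteLine(Regex.IsMatch("Price (USD)", Regex.Escape("(USD)")))
    End Sub
End Module
```

If the search text comes from a user rather than from a developer who knows regular expression syntax, escaping it first is usually the safer choice.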

Listing 1

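' Place these Imports at the top of the code file.
Imports System.Net
Imports System.Text.RegularExpressions

' Returns "Listed", "Not Listed", or an "Error:" message for the given URL and search text.
Public Function URLListed(ByVal URL As String, ByVal strArgument As String) As String
  Dim blnListed As String
  blnListed = readWebPage(URL, strArgument)
  Return blnListed
End Function

Private Function readWebPage(ByVal strSource As String, ByVal strArgument As String) As String
  Dim strLine As String
  Dim objSR As System.IO.StreamReader = Nothing
  Dim objResponse As WebResponse = Nothing
  Try
    ' Request the page and wrap the response stream in a reader.
    Dim objRequest As WebRequest = System.Net.HttpWebRequest.Create(strSource)
    objResponse = objRequest.GetResponse()
    objSR = New System.IO.StreamReader(objResponse.GetResponseStream(), System.Text.Encoding.ASCII)
    ' Read the page line by line and stop at the first match.
    Do While objSR.EndOfStream = False
      strLine = objSR.ReadLine()
      If Regex.IsMatch(strLine, strArgument) Then
        Return "Listed"
      End If
    Loop
    Return "Not Listed"
  Catch f As Exception
    Return "Error:" & f.Message
  Finally
    ' Release the reader and the response whether or not a match was found.
    If objSR IsNot Nothing Then objSR.Close()
    If objResponse IsNot Nothing Then objResponse.Close()
  End Try
End Function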
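
Listing 1 tests one line at a time, so a phrase that wraps across a line break in the page source will not be found. A minimal sketch of one way around that, assuming it is acceptable to read the whole page into memory, is shown below. The PhraseListed helper and the sample text are hypothetical and not part of the original listing; the helper takes a TextReader so it can be tried with a StringReader instead of a live web request.

```vb
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module WholePageDemo
    ' Hypothetical helper: reads all text from the reader and tests it once.
    ' Words in the phrase are escaped and joined with "\s+", so any run of
    ' whitespace (including a line break) between them is accepted.
    Function PhraseListed(ByVal reader As TextReader, ByVal phrase As String) As Boolean
        Dim pageText As String = reader.ReadToEnd()
        Dim words() As String = phrase.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        For i As Integer = 0 To words.Length - 1
            words(i) = Regex.Escape(words(i))
        Next
        Dim pattern As String = String.Join("\s+", words)
        Return Regex.IsMatch(pageText, pattern)
    End Function

    Sub Main()
        ' Made-up page source in which the phrase is split across two lines.
        Dim source As String = "<title>IIS" & Environment.NewLine & "Logs - Home</title>"
        Console.WriteLine(PhraseListed(New StringReader(source), "IIS Logs -"))
    End Sub
End Module
```

With a real page, the StreamReader created in Listing 1 could be passed to the helper in place of the StringReader, since StreamReader is also a TextReader.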


I hope this example helps in your Regular Expressions adventure. Happy coding!



