CodeSnip: Virtual Web Services Through Pattern Matching
 
Published: 23 Nov 2005
Unedited - Community Contributed
Abstract
In this code snippet, Rajesh shows how we can make a static Web site into a virtual Web service. HTML pattern matching is used to implement this solution along with a WSDL file.
by Rajesh Toleti

Introduction

Let us assume that a website provides a free web service for weather reports. We consume that service and display the weather reports on our own website. One fine day, the provider has had enough and shuts the web service down permanently, although it still displays the weather report on its own website. You want to capture that information and display it on your website.

This is where a virtual web service comes into the picture. Even though the site no longer provides a web service, you can turn the website itself into a virtual web service. To implement this kind of solution, you need a basic understanding of WSDL and pattern matching, in addition to the usual procedures for consuming a web service in .NET.

Legal Disclaimer

Before you implement this solution, you should obtain permission from the website from which you intend to extract information. I am going to explain this concept by using an example from http://www.taryatechnologies.com/. You are also free to use this website.

Scenario

Assume you would like to show the services listed on the page http://www.taryatechnologies.com/aboutus.asp on your own web page as your own services.

We achieve this in three steps.

Step 1: Create a WSDL File

Step 2: Build a Proxy Class With the Above WSDL File

Step 3: Write Code for Consuming the Web Service

 

Step 1: Create a WSDL File

Listing 1: WSDL File

<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions
    xmlns:s="http://www.w3.org/2001/XMLSchema"
    xmlns:http="http://schemas.xmlsoap.org/wsdl/http/"
    xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"
    xmlns:tm="http://microsoft.com/wsdl/mime/textMatching/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:s0="http://www.taryatechnologies.com"
    targetNamespace="http://www.taryatechnologies.com"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <wsdl:types/>
  <wsdl:message name="msgHttpGetIn" />
  <wsdl:message name="msgHttpGetOut" />
  <wsdl:portType name="ptypeHttpGet">
    <wsdl:operation name="GetTaryaServices">
      <wsdl:input message="s0:msgHttpGetIn"/>
      <wsdl:output message="s0:msgHttpGetOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="bindHttpGet"
      type="s0:ptypeHttpGet">
    <http:binding verb="GET"/>
    <wsdl:operation name="GetTaryaServices">
      <http:operation location="/aboutus.asp"/>
      <wsdl:input>
        <http:urlEncoded/>
      </wsdl:input>
      <wsdl:output>
        <tm:text>
          <tm:match
              name='myServices'
              pattern='&lt;ul&gt;(.*?)ul&gt;'
              ignoreCase='true'
              repeats='100' />
        </tm:text>
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="TaryaService">
    <wsdl:port
        name="ptypeHttpGet"
        binding="s0:bindHttpGet">
      <http:address location="http://www.taryatechnologies.com" />
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>

 

I am not going to explain the WSDL file format in detail, as it is outside the scope of this article. You can find that information at http://www.w3.org/TR/wsdl.

In the above file, a few values are specific to this example, and you have to change them when you create your own WSDL file. I explain them below.

Listing 2:

xmlns:s0="http://www.taryatechnologies.com"

targetNamespace="http://www.taryatechnologies.com"


You have to specify the URL of the website from which you extract information.

Listing 3:

<wsdl:operation name="GetTaryaServices">


This is the method name. It can be anything you fancy. You use it later in the code (for consuming web service).

Listing 4:

<http:operation location="/aboutus.asp"/>


This is the relative path to the specific file from which you extract information.

Listing 5:

<tm:match
    name='myServices'
    pattern='&lt;ul&gt;(.*?)ul&gt;'
    ignoreCase='true'
    repeats='100' />


This is the most important part of the WSDL file. The name of the match element can be anything; it becomes the name you use to read the results in your code. The pattern attribute holds the regular expression that selects the content to extract from the web page, ignoreCase makes the match case-insensitive, and repeats sets the maximum number of matches to return. You need to be skillful while writing this expression. There are good sources on the Net to learn pattern matching. One of them is available at http://www.evolt.org/node/22700.

Before writing the expression, you need to define exactly what you want from the website. Look at the HTML source of the web page from which you want to extract information. In our example, I want to extract the services offered by Taryatechnologies. View the HTML code for www.taryatechnologies.com/aboutus.asp. The piece of information we want is as follows (in HTML).

<ul>
    <li>Web Site Development</li>
    <li>Web Applications</li>
    <li>Web services</li>
    <li>Graphical Designs</li>
    <li>Mobile Applications</li>
    <li>Digital Signage solutions</li>
</ul>


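We want the information between the <ul> and </ul> tags. So our regular expression will be: &lt;ul&gt;(.*?)ul&gt;

Because the pattern is stored in an XML attribute, < and > are written as &lt; and &gt;. Unescaped, the expression reads <ul>(.*?)ul>.

The same result can be obtained by different expressions. The symbols used in this one are:

() Groups a sequence of characters and captures what it matches.
. Matches any character except a newline.
* Matches the preceding element zero or more times.
? Placed after *, makes the match lazy, so it stops at the first possible ending instead of the last.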
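
To see what the expression captures before putting it in a WSDL file, it can help to try it on its own. The following is a minimal sketch, not the code the proxy runs: the class name PatternDemo, the method ExtractList, and the one-line sample markup are assumptions made for illustration, and it calls System.Text.RegularExpressions directly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Unescaped form of the WSDL attribute value &lt;ul&gt;(.*?)ul&gt;
    public const string ListPattern = "<ul>(.*?)ul>";

    // Returns the text captured by group 1, or null when nothing matches.
    public static string ExtractList(string html)
    {
        Match match = Regex.Match(html, ListPattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample markup in the style of the article, collapsed onto one line (assumption).
        string html = "<h3>Services</h3><ul><li>Web Applications</li><li>Web services</li></ul>";

        Console.WriteLine(ExtractList(html));
        // prints: <li>Web Applications</li><li>Web services</li></
    }
}
```

Note that the captured text ends with "</", because the pattern closes on ul> rather than on </ul>. Also, since . does not match a newline, this standalone call would need RegexOptions.Singleline if the list spanned several lines of the page source.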

 

Step 2: Build a Proxy Class With the Above WSDL File

Now we have to create a VB.NET or C# source file from the above WSDL file. Save the WSDL file as Taryaservices.wsdl and copy it to the directory that contains wsdl.exe. From the command prompt, type the following to generate a VB.NET file (use /l:cs instead of /l:vb to generate C#):

wsdl /l:vb Taryaservices.wsdl

This creates a VB.NET file named TaryaService.vb; the name is taken automatically from the service name in the WSDL file. Now copy this file to the directory that contains the Visual Basic .NET command-line compiler (vbc.exe) and type the following at the command prompt:

vbc /t:library /r:System.dll,System.Web.Services.dll,System.Xml.dll TaryaService.vb

 

You have now compiled the proxy class into an assembly named TaryaService.dll. Copy it to the /bin directory of your application.
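
The generated source file is not reproduced in this article. As a rough sketch only, and assuming wsdl.exe maps the text-matching binding onto the types in System.Web.Services.Protocols in the usual way, the proxy would look something like the following when written in C#. The exact attributes and layout of the real generated file may differ; treat this as an illustration of where the names TaryaService, GetTaryaServices, GetTaryaServicesMatches and myServices come from. It needs a reference to System.Web.Services.dll to compile.

```csharp
using System.Web.Services.Protocols;

// Hypothetical reconstruction; the real file is produced by wsdl.exe.
// The class name comes from <wsdl:service name="TaryaService">.
public class TaryaService : HttpGetClientProtocol
{
    public TaryaService()
    {
        // From <http:address location="..."/> in the WSDL file
        this.Url = "http://www.taryatechnologies.com";
    }

    // From <wsdl:operation name="GetTaryaServices"> and
    // <http:operation location="/aboutus.asp"/>
    [HttpMethod(typeof(TextReturnReader), typeof(UrlParameterWriter))]
    public GetTaryaServicesMatches GetTaryaServices()
    {
        return (GetTaryaServicesMatches)this.Invoke(
            "GetTaryaServices", this.Url + "/aboutus.asp", new object[0]);
    }
}

// Result type named after the operation, with one field per <tm:match> element
public class GetTaryaServicesMatches
{
    // name, pattern, ignoreCase and repeats from the <tm:match> element
    [Match("<ul>(.*?)ul>", IgnoreCase = true, MaxRepeats = 100)]
    public string[] myServices;
}
```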

 

Step 3: Write Code for Consuming the Web Service

I created an ASP.NET page in C# to view the extracted information.

Listing 6: TaryaServiceCS File

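<%@ Page Language="C#" Debug="true" %>

<script runat="Server">
    void Page_Load()
    {
        // Proxy class and result type generated from the WSDL file in Step 2
        TaryaService objTaryaService;
        GetTaryaServicesMatches objMatches;

        try
        {
            objTaryaService = new TaryaService();
            // Give up if the remote site does not respond within 2000 milliseconds
            objTaryaService.Timeout = 2000;
            // Request /aboutus.asp and apply the pattern from the WSDL file
            objMatches = objTaryaService.GetTaryaServices();
            // Show the first match
            lblTitles.Text = objMatches.myServices[0];
        }
        catch (Exception e)
        {
            // Show the error message instead of the extracted text
            lblTitles.Text = e.Message;
        }
    }
</script>

<html>
<head>
    <title>TestTaryaService.aspx</title>
</head>
<body>
    <h3>My Own Services</h3>
    <font color="blue">
        <asp:Label ID="lblTitles" runat="Server" />
    </font>
</body>
</html>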

 

This file has a label that shows the extracted information. In the Page_Load event handler, we declare a TaryaService object and a GetTaryaServicesMatches object. objMatches is assigned the result of calling GetTaryaServices on objTaryaService. We then read myServices[0] from objMatches. Note the index 0: myServices holds one entry per match, and here we assume the pattern matches only once.

In the following example, I replaced the ul element with li:

&lt;li&gt;(.*?)li&gt;

 

In this case, we might get six matches, and we would iterate over indexes 0 to 5 to display them all. We wrapped our code in a try-catch block in case we encounter any issues. When everything is set up properly, the page should display the extracted information. That wraps up this code snippet. Thanks for reading!
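
The iteration over several matches can be sketched outside the proxy as follows. This is only an illustration under stated assumptions: ItemDemo and ExtractItems are hypothetical names, the sample markup is a shortened one-line version of the list shown earlier, and the code uses Regex.Matches directly rather than the generated GetTaryaServicesMatches class. With the proxy, the same loop would run over objMatches.myServices.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ItemDemo
{
    // Unescaped form of the WSDL attribute value &lt;li&gt;(.*?)li&gt;
    public const string ItemPattern = "<li>(.*?)li>";

    // Collects the text captured by group 1 of every match, in document order.
    public static List<string> ExtractItems(string html)
    {
        List<string> items = new List<string>();
        foreach (Match match in Regex.Matches(html, ItemPattern, RegexOptions.IgnoreCase))
        {
            items.Add(match.Groups[1].Value);
        }
        return items;
    }

    public static void Main()
    {
        // Shortened one-line sample (assumption); the real page has six items.
        string html = "<ul><li>Web Site Development</li><li>Web Applications</li><li>Web services</li></ul>";

        List<string> items = ExtractItems(html);
        for (int i = 0; i < items.Count; i++)
        {
            // Each captured value ends with "</" because the pattern closes on li>
            Console.WriteLine(i + ": " + items[i]);
        }
    }
}
```

Because the pattern ends in li> rather than </li>, every captured value carries a trailing "</" that you would probably want to trim before display.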



User Comments

Title: great   
Name: Saif
Date: 2006-04-10 11:01:51 AM
Comment:
Thank you
Title: Good article   
Name: Chandan
Date: 2005-12-13 4:27:01 AM
Comment:
The article is good, and it gives a very good description about the virtual web service
Title: Great article   
Name: Kay Lee
Date: 2005-11-29 10:01:14 PM
Comment:
This is a great artice, and I like that you wrote it up. I just wanted point out that this capability may or may not become legally irritating. If the site you're leeching is considered a service for sale type of site, it can become a serious problem.

For the readers, please be discrete and courteous in regards to creating Virtual Web Services.
Title: Virtual Web Services Cannot Be Blocked   
Name: Deavon
Date: 2005-11-29 8:19:07 AM
Comment:
Steve Burch:

Unfortunately, there is no way to block a virtual web service; unless you block the content from the end user completely. Anything that can be represented by XML, HTML, RSS, or any other form of non-encrypted data storage can be reparsed and reused.

It is a similar concept to copy and pasting; and then reformatting with customized style sheets; except that, in the form of a VWS, that kind of functionality occurs automatically through compiled processing and parsing of the source file; to generate a SOAP file as the result.

Same data; different way of showing it.

Utilizing .NET's WSDL and Web Service architecture; it is easy to expose the data as a WSDL function; which is exactly what is occuring here.
Title: Virtual web service -- blocking?   
Name: steve burch
Date: 2005-11-27 5:35:01 PM
Comment:
you should show, if possible, how a web site can block someone from doing this.
Title: scan and read scriptedinfo   
Name: case
Date: 2005-11-25 1:54:18 AM
Comment:
How do you read a letter that's in scriped.
Title: Good Article   
Name: Pavan K
Date: 2005-11-23 5:44:54 AM
Comment:
Its a great article, as it gives complete insight into virtual web service. Its a highly recommended reading.





