Finding Text in a txt file
Hi,
I am trying to write a program that saves the HTML of a web page to a text file and then displays information from that HTML in labels and similar controls.
I have figured out how to get the HTML and save it to a .txt file, but I am having trouble finding a way to read the file back and display specific pieces of information.
For example:
I want to display "Data1" in a label, given this HTML:
<td valign="top"><font color="#FFFF00" face="Small Fonts" style="font-size: 9px">Example1:</td>
<td valign="top"><font color="#FFFFFF" face="Small Fonts" style="font-size: 9px">Data1</td>
Thanks for your help
Mike
mike_w at 2007-11-11 8:07:39 >

# 1 Re: Finding Text in a txt file
So you want your program to skip "Example1:" and put "Data1" in a label?
If you own the site the HTML comes from, you could add a name to the TD containing the Data1 string so that it is easier to identify. Like this:
<td valign="top" name="thisone"><font crap>Data1</font></td>
Or you could add a div tag around the text to extract, like this:
<td valign="top"><font crap><div id="thisone">Data1</div></font></td>
After that I don't know how to extract it from the txt file, but I'm sure someone else does xD
# 2 Re: Finding Text in a txt file
Thank you for your response, but unfortunately I don't own the site.
The section with "Example1" is constant, so I will probably have to search for:
<td valign="top"><font color="#FFFF00" face="Small Fonts" style="font-size: 9px">Example1:</td>
<td valign="top"><font color="#FFFFFF" face="Small Fonts" style="font-size: 9px">
and then start writing to the label, stopping at the next "<".
But I'm not sure how to do that :S
mike_w at 2007-11-11 21:48:59 >

# 3 Re: Finding Text in a txt file
What language are you using?
# 4 Re: Finding Text in a txt file
Sorry, I'm using VB.NET.
mike_w at 2007-11-11 21:51:09 >

# 5 Re: Finding Text in a txt file
If the two table cells are not on separate lines, you can do this:
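Imports System.Text.RegularExpressions
' pattern must capture the wanted text in its first group;
' text holds the contents of the file.
Dim Data As String = Nothing
Dim exp As New Regex(pattern, _
    RegexOptions.IgnoreCase Or RegexOptions.Multiline)
Dim match As Match = exp.Match(text)
If match.Success Then
    Data = match.Groups(1).Value
End If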
I can't figure out how to get the regular expression to search across multiple lines, so you may have to resort to brute force:
' Find 'Example1'
Dim example1 As Integer = InStr(text, "Example1", CompareMethod.Text)
' Find next <font> tag
Dim fontTag As Integer = InStr(example1, text, "<font ", CompareMethod.Text)
' Find end of <font> tag
Dim endFont As Integer = InStr(fontTag, text, ">")
' Extract text between end of <font>
' and beginning of </td>
Dim endTd As Integer = InStr(endFont, text, "</td", CompareMethod.Text)
Dim Data As String = Mid(text, endFont + 1, endTd - endFont - 1)
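On the multi-line question: a regular expression can span the line break if the pattern itself allows whitespace between the two cells, because `\s` also matches carriage returns and newlines (RegexOptions.Multiline only changes how ^ and $ behave). Below is a minimal sketch, assuming the file looks like the two lines in the original question. The helper name `ExtractAfterLabel` and the pattern are illustrative and not taken from the thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module ExtractDemo
    ' Hypothetical helper: returns the text of the cell that follows the
    ' cell containing the given label, or Nothing if there is no match.
    Function ExtractAfterLabel(ByVal html As String, ByVal label As String) As String
        ' \s* between </td> and <td lets the match cross a line break.
        Dim pattern As String = Regex.Escape(label) & _
            "\s*(?:</font>\s*)?</td>\s*<td[^>]*>\s*<font[^>]*>([^<]*)"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        ' Sample input modelled on the two lines in the question.
        Dim html As String = _
            "<td valign=""top""><font color=""#FFFF00"">Example1:</td>" & vbCrLf & _
            "<td valign=""top""><font color=""#FFFFFF"">Data1</td>"
        Console.WriteLine(ExtractAfterLabel(html, "Example1:"))
    End Sub
End Module
```

With this approach the brute-force search is only needed if the markup between the two cells varies more than the pattern allows.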
# 6 Re: Finding Text in a txt file
Thank you for the example,
I have tried the second example, as they are on separate lines in the text file, but I keep getting the following error:
An unhandled exception of type 'System.ArgumentException' occurred in microsoft.visualbasic.dll
Additional information: Argument 'Start' must be greater or equal to zero.
It appears when trying to run this section of the code:
Dim fontTag As Integer = InStr(example1, text, "<font ", CompareMethod.Text)
mike_w at 2007-11-11 21:53:02 >

# 7 Re: Finding Text in a txt file
That means example1 is equal to zero; InStr is not finding the initial text in your input string.
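InStr returns 0 when the search string is not found, and passing 0 as the start position of the next InStr call is what raises the exception. A minimal sketch of the same brute-force steps with a check after each search follows; the helper name `FindData` is hypothetical.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module GuardedSearch
    ' Hypothetical helper: same steps as the brute-force search, but each
    ' InStr result is checked before it is used as the next start position.
    Function FindData(ByVal text As String) As String
        Dim example1 As Integer = InStr(text, "Example1", CompareMethod.Text)
        If example1 = 0 Then Return Nothing      ' label not found

        Dim fontTag As Integer = InStr(example1, text, "<font ", CompareMethod.Text)
        If fontTag = 0 Then Return Nothing       ' no <font> tag after the label

        Dim endFont As Integer = InStr(fontTag, text, ">")
        If endFont = 0 Then Return Nothing       ' <font> tag never closes

        Dim endTd As Integer = InStr(endFont, text, "</td", CompareMethod.Text)
        If endTd = 0 Then Return Nothing         ' no closing </td>

        Return Mid(text, endFont + 1, endTd - endFont - 1)
    End Function

    Sub Main()
        ' Input without the label: the function returns Nothing instead of throwing.
        Console.WriteLine(FindData("no label here") Is Nothing)
    End Sub
End Module
```

Returning Nothing lets the caller decide what to show in the label when the expected markup is missing.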
# 8 Re: Finding Text in a txt file
This is the code I am using:
Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
Dim oFile As System.IO.File
Dim oRead As System.IO.StreamReader
oRead = oFile.OpenText("C:\testhtml.txt")
Dim example1 As Integer = InStr(Text, "Example1", CompareMethod.Text)
Dim fontTag As Integer = InStr(example1, Text, "<font ", CompareMethod.Text)
Dim endFont As Integer = InStr(fontTag, Text, ">")
Dim endTd As Integer = InStr(endFont, Text, "</td", CompareMethod.Text)
Dim Data As String = Mid(Text, endFont + 1, endTd - endFont - 1)
lblTest.Text = Data
End Sub
and this is the contents of the text file:
<td valign=""top""><font color=""#FFFF00"" face=""Small Fonts"" style=""font-size: 9px"">Example1:</td>
<td valign=""top""><font color=""#FFFFFF"" face=""Small Fonts"" style=""font-size: 9px"">Data</td>
Is this section of my code, which opens the file for reading, wrong?
Dim oFile As System.IO.File
Dim oRead As System.IO.StreamReader
oRead = oFile.OpenText("C:\testhtml.txt")
mike_w at 2007-11-11 21:55:08 >

# 9 Re: Finding Text in a txt file
You're opening the file fine, but I don't see where you're actually reading it or assigning anything to the Text variable.
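Inside a form's event handler, `Text` with nothing assigned to it refers to the form's own Text property (its caption), so the searches run against the caption rather than the file. A minimal sketch of reading the whole file into a string first is shown below. The helper name `ReadAllOf` is hypothetical, and the demo writes a temporary file only so that it runs on its own.

```vbnet
Imports System
Imports System.IO

Module ReadFileDemo
    ' Hypothetical helper: reads the whole file into one string and
    ' closes the reader even if reading fails.
    Function ReadAllOf(ByVal fileName As String) As String
        Dim reader As StreamReader = File.OpenText(fileName)
        Try
            Return reader.ReadToEnd()
        Finally
            reader.Close()
        End Try
    End Function

    Sub Main()
        ' Create a throwaway file so the sketch is self-contained.
        Dim tempFile As String = Path.GetTempFileName()
        Dim writer As StreamWriter = File.CreateText(tempFile)
        writer.Write("Example1:</td>")
        writer.Close()

        Console.WriteLine(ReadAllOf(tempFile))
        File.Delete(tempFile)
    End Sub
End Module
```

In the button handler this would become something like `Dim fileText As String = ReadAllOf("C:\testhtml.txt")`, with `fileText` passed to InStr and Mid in place of `Text`.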
# 10 Re: Finding Text in a txt file
Hi, I just finished a project in C# that does exactly what you are doing.
What you need is to parse your file using the Regex.Split method, just like Phil did.
But you have to find the right pattern to split on.
Here is the code that I used in my project.
Try replacing PATTERN_VAL with:
<td valign=""top""><font color=""#FFFF00"" face=""Small Fonts"" style=""font-size: 9px"">Example1:</td>
<td valign=""top""><font color=""#FFFFFF"" face=""Small Fonts"" style=""font-size: 9px"">
The result is a string array; the text after the matched pattern is at index 1, so the data you want is at the start of that element.
' OpenText is a shared method, so call it on the File class directly.
Dim oRead As System.IO.StreamReader = System.IO.File.OpenText("C:\testhtml.txt")
Dim fileContent As String = oRead.ReadToEnd()
oRead.Close()
Dim reg As New System.Text.RegularExpressions.Regex(PATTERN_VAL)
Dim result As String() = reg.Split(fileContent)
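Because Split returns everything after the match in the second element, the value still has to be cut at the next "<". A minimal sketch of that step follows, using an illustrative pattern with `\s*` in place of the literal two-line PATTERN_VAL; the sample input and variable names are assumptions.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SplitDemo
    Sub Main()
        ' Sample input modelled on the two lines in the question.
        Dim fileContent As String = _
            "<td><font color=""#FFFF00"">Example1:</td>" & vbCrLf & _
            "<td><font color=""#FFFFFF"">Data1</td>"

        ' Illustrative pattern: the label cell, optional whitespace, then the
        ' opening tags of the next cell. It has no capturing groups, so Split
        ' returns only the text before and after the match.
        Dim reg As New Regex("Example1:</td>\s*<td[^>]*><font[^>]*>", RegexOptions.IgnoreCase)
        Dim parts As String() = reg.Split(fileContent)

        If parts.Length > 1 Then
            ' parts(1) holds everything after the match; keep it up to the next tag.
            Dim rest As String = parts(1)
            Dim cut As Integer = rest.IndexOf("<"c)
            If cut >= 0 Then rest = rest.Substring(0, cut)
            Console.WriteLine(rest)
        End If
    End Sub
End Module
```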
# 11 Re: Finding Text in a txt file
This same code might have been offered but take a look at this:
'Get your data by loading the HTML however you want, then assign that text to a variable called SearchText.
Dim SearchText As String = TextBox1.Text
'Get rid of all line breaks.
SearchText = System.Text.RegularExpressions.Regex.Replace(SearchText, vbCrLf, "")
'Search for all pieces that match the given expression.
For Each match As System.Text.RegularExpressions.Match In System.Text.RegularExpressions.Regex.Matches(SearchText, "(?<=<td[^>]*?><font[^>]*?>)[^<]*", System.Text.RegularExpressions.RegexOptions.IgnoreCase)
    'Ignore all "example1:" cells.
    If Not match.Value.ToLower = "example1:" Then
        'Use the value in a label or however you need.
        MsgBox(match.Value)
    End If
Next
AdamP at 2007-11-11 21:58:11 >
