Search bit string in file
Hello!
I'm looking for an algorithm which finds all occurrences of a bit sequence (e.g., "0001") in a file. This sequence can start at any bit in the file (it is not byte aligned).
I have some ideas of how to approach the problem: 1) read the file into an unsigned char buffer, 2) define a bit structure, 3) compare the first 4 bits of the buffer with the bit structure, 4) shift the char buffer one bit to the left, 5) repeat from step 3. But I'm not sure whether this is a good or the right approach.
It would be great to get some input, possibly with sample source code.
Many thanks,
Michael
[628 byte] By [Daneel] at [2007-11-11 10:21:32]
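For reference, the shift-and-compare idea from the question can be sketched with a sliding window of bits instead of shifting the whole buffer. This is only a minimal sketch under some assumptions: the function name findBitPattern is hypothetical, the pattern is passed as an integer plus a length of at most 64 bits, and bits are numbered starting from the most significant bit of the first byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the bit offsets (0 = most significant bit of
// the first byte) at which the low `patternLen` bits of `pattern` occur in
// `data`. Assumes 1 <= patternLen <= 64; otherwise returns no hits.
std::vector<std::size_t> findBitPattern(const std::vector<unsigned char>& data,
                                        std::uint64_t pattern,
                                        unsigned patternLen)
{
    std::vector<std::size_t> hits;
    if (patternLen == 0 || patternLen > 64) return hits;

    // Mask that keeps only the last patternLen bits of the window.
    const std::uint64_t mask = (patternLen == 64)
        ? ~std::uint64_t(0)
        : ((std::uint64_t(1) << patternLen) - 1);
    pattern &= mask;

    std::uint64_t window = 0;   // the most recently read bits
    std::size_t bitsSeen = 0;   // total number of bits consumed so far
    for (std::size_t i = 0; i < data.size(); ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            // Shift the next bit of the file into the window.
            window = (window << 1) | ((data[i] >> bit) & 1u);
            ++bitsSeen;
            if (bitsSeen >= patternLen && (window & mask) == pattern)
                hits.push_back(bitsSeen - patternLen);
        }
    }
    return hits;
}
```

Called as findBitPattern(buffer, 0x1, 4), it searches for "0001". Because the window carries over from one byte to the next, matches that straddle a byte boundary are found as well.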

# 1 Re: Search bit string in file
Hi,
why not use std::string::find_first_of() to find the first binary digit in your string
and then "munch up" all digits until you find a non-digit?
Something like:
string myString = "lgdha;0110101001dajhglg10101asyhdf001001";
size_t startPos = myString.find_first_of("01");
size_t endPos = myString.find_first_not_of("01", startPos);
string result = myString.substr(startPos, endPos - startPos); // should be the first
// substring that represents a binary number
Mind you I haven't tested this, but a little investigation will get you started in no time ;-)
Cheers,
D
# 2 Re: Search bit string in file
Hi!
Sorry, I think I did not make myself clear. I'm talking about bit strings / bit sequences here, as in: a byte is an 8-bit type. I'm talking about the binary level, not about finding substrings in strings.
Thanks,
Michael
Daneel at 2007-11-11 21:00:07 >

# 3 Re: Search bit string in file
Sorry, I misunderstood you there.
But you can still use the STL.
For instance, you could translate the bytes of your file into a bit-string (the string will be 8 times the size of the file; if that is a problem, read your file in chunks)
and then do a find on the string. That should do the trick.
Cheers,
D
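A minimal sketch of the "do a find on the string" step, assuming the file has already been converted into a string of '0' and '1' characters. The helper name findAllOccurrences is hypothetical; it only relies on std::string::find.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects every position at which `pattern` occurs in
// `bits`, including overlapping matches. An empty pattern yields no matches.
std::vector<std::size_t> findAllOccurrences(const std::string& bits,
                                            const std::string& pattern)
{
    std::vector<std::size_t> positions;
    if (pattern.empty()) return positions;

    std::size_t pos = bits.find(pattern);
    while (pos != std::string::npos) {
        positions.push_back(pos);
        // Continue one character further so overlapping matches are kept.
        pos = bits.find(pattern, pos + 1);
    }
    return positions;
}
```

Each returned position is a bit offset into the file. If the file is read in chunks, a match can straddle two chunks, so the last pattern.size() - 1 characters of the previous chunk should be kept in front of the next one.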
# 4 Re: Search bit string in file
Sounds interesting, can you give a code example?
Can I also modify the binary file that way, i.e., translate it into a string, manipulate the string, and translate it back into binary?
Thanks,
Michael
Daneel at 2007-11-11 21:02:00 >

# 5 Re: Search bit string in file
Of course you can.
string charToBitString(unsigned char theByte)
{
    string reval("00000000");
    unsigned char testBit = 128; // start with the most significant bit
    for(int i=0;i<8;i++)
    {
        if((testBit & theByte) == testBit)
            reval[i] = '1';
        testBit >>= 1; // move on to the next lower bit
    }
    return reval; // here reval holds the bit pattern
}
...
string fileAsABitString("");
// call in a loop for every byte of the file
fileAsABitString += charToBitString(fileByte);
...
Again, all the spellos read in the right order form an old serbo-kroatian saying, so I leave
the debugging to you...
Cheers,
D
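For the way back (bit-string to binary), a minimal sketch could look like the following. The name bitStringToBytes is hypothetical, and it assumes the string contains only '0' and '1' characters with the most significant bit first, matching charToBitString above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical inverse of charToBitString: packs a string of '0'/'1'
// characters back into bytes, most significant bit first.
// Assumes the string contains only '0' and '1'. Trailing characters that do
// not fill a whole byte are ignored.
std::vector<unsigned char> bitStringToBytes(const std::string& bits)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(bits.size() / 8);
    for (std::size_t i = 0; i + 8 <= bits.size(); i += 8) {
        unsigned char value = 0;
        for (std::size_t j = 0; j < 8; ++j) {
            // Shift in one bit per character.
            value = static_cast<unsigned char>(
                (value << 1) | (bits[i + j] == '1' ? 1 : 0));
        }
        bytes.push_back(value);
    }
    return bytes;
}
```

The resulting bytes can then be written back to a file opened in binary mode. If the manipulation changes the length of the string so that it is no longer a multiple of 8, the leftover bits have to be padded or handled separately.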