
Search bit string in file

Hello!

I'm looking for an algorithm that finds all occurrences of a bit sequence (e.g., "0001") in a file. The sequence can start at any bit in the file (it is not byte-aligned).

I have some ideas for how to approach the problem:
1) read the file into an unsigned char buffer,
2) define a bit structure,
3) compare the first 4 bits of the buffer with the bit structure,
4) shift the char buffer left by one bit,
5) repeat from step 3.
But I'm not sure whether this is a good, or even the right, approach.

It would be great to get some input, possibly with sample source code.

Many thanks,
Michael
[628 byte] By [Daneel] at [2007-11-11 10:21:32]
# 1 Re: Search bit string in file
Hi,
why not use std::string's find_first_of() to find the first binary digit in your string
and then "munch up" all digits until you hit a non-digit with find_first_not_of()?
Something like:

string myString = "lgdha;0110101001dajhglg10101asyhdf001001";
// position of the first '0' or '1'
size_t startPos = myString.find_first_of("01");
// position of the first character after that run of binary digits
size_t endPos = myString.find_first_not_of("01", startPos);
// should be the first substring that represents a binary number
string result = myString.substr(startPos, endPos - startPos);

Mind you I haven't tested this, but a little investigation will get you started in no time ;-)

Cheers,

D
drkybelk at 2007-11-11 20:59:07 >
# 2 Re: Search bit string in file
Hi!

Sorry, I think I did not make myself clear. I'm talking about bit strings / bit sequences here, as in: a byte is an 8-bit type. I'm talking about the binary level, not about finding substrings in strings.

Thanks,
Michael
Daneel at 2007-11-11 21:00:07 >
# 3 Re: Search bit string in file
Sorry, I misunderstood you there.
But you can still use the STL.
For instance, you could translate the bytes of your file into a bit-string (the string will be 8 times the size of the file; if that is a problem, read your file in chunks)
and then do a find on the string. That should do the trick.
Cheers,
D
drkybelk at 2007-11-11 21:01:01 >
# 4 Re: Search bit string in file
Sounds interesting, can you give a code example?

Can I also modify the binary file that way, i.e., translate it into a string, manipulate the string, and translate it back into binary?

Thanks,
Michael
Daneel at 2007-11-11 21:02:00 >
# 5 Re: Search bit string in file
Of course you can.

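#include <string>
using std::string;

// Returns the 8 bits of theByte as a string of '0'/'1', most significant bit first.
string charToBitString(unsigned char theByte)
{
    string reval("00000000");
    unsigned char testBit = 128;   // start with the most significant bit
    for (int i = 0; i < 8; i++)
    {
        if ((testBit & theByte) == testBit)
            reval[i] = '1';
        testBit >>= 1;             // move on to the next lower bit
    }
    return reval; // here reval should hold the bit-pattern
}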

...
string fileAsABitString("");
// call in a loop for every byte of the file
fileAsABitString += charToBitString(fileByte);
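Once the file is in fileAsABitString, finding every occurrence is a loop around std::string::find. A minimal sketch, assuming the pattern is also given as a string of '0'/'1' characters; the name findAllBitOffsets is made up for this example. It restarts the search one position after each hit so that overlapping matches are reported too.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the bit offset (0 = most significant bit of
// the first byte) of every occurrence of `pattern` in `bits`.
std::vector<std::size_t> findAllBitOffsets(const std::string& bits,
                                           const std::string& pattern)
{
    std::vector<std::size_t> offsets;
    if (pattern.empty())
        return offsets;                      // nothing sensible to search for

    std::size_t pos = bits.find(pattern);
    while (pos != std::string::npos) {
        offsets.push_back(pos);
        pos = bits.find(pattern, pos + 1);   // step by one to catch overlaps
    }
    return offsets;
}
```

Dividing an offset by 8 gives the byte it starts in, and the remainder gives the bit position inside that byte.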

...
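For going back from the manipulated string to binary, something along these lines should work. It is only a sketch: bitStringToBytes is a name I made up, and it assumes the string length is a multiple of 8 and that the most significant bit comes first, as in charToBitString above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical inverse of charToBitString: packs every group of 8
// '0'/'1' characters into one byte, most significant bit first.
std::vector<unsigned char> bitStringToBytes(const std::string& bits)
{
    if (bits.size() % 8 != 0)
        throw std::invalid_argument("bit string length must be a multiple of 8");

    std::vector<unsigned char> bytes;
    bytes.reserve(bits.size() / 8);
    for (std::size_t i = 0; i < bits.size(); i += 8) {
        unsigned char value = 0;
        for (std::size_t j = 0; j < 8; ++j) {
            const char c = bits[i + j];
            if (c != '0' && c != '1')
                throw std::invalid_argument("bit string may only contain '0' and '1'");
            // shift what we have so far and append the new bit on the right
            value = static_cast<unsigned char>((value << 1) | (c == '1' ? 1 : 0));
        }
        bytes.push_back(value);
    }
    return bytes;
}
```

The resulting vector can then be written back to the file as raw bytes.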

Again, all the spellos read in the right order form an old serbo-kroatian saying, so I leave
the debugging to you...
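If the string being 8 times the size of the file is a problem, the shifting idea from the original question also works without any string at all. The following is a rough sketch under a few assumptions of mine: the pattern is at most 32 bits long and is passed right-aligned in an integer, bits are counted most significant bit first, and the name findBitPattern is invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: slides a window over `data` one bit at a time and
// returns the bit offset at which each occurrence of the pattern starts.
// `pattern` holds the bits right-aligned; `patternLen` must be 1..32.
std::vector<std::size_t> findBitPattern(const std::vector<unsigned char>& data,
                                        std::uint32_t pattern,
                                        unsigned patternLen)
{
    std::vector<std::size_t> offsets;
    if (patternLen == 0 || patternLen > 32)
        return offsets;

    // mask that keeps only the lowest patternLen bits of the window
    const std::uint32_t mask =
        (patternLen == 32) ? 0xFFFFFFFFu : ((1u << patternLen) - 1u);

    std::uint32_t window = 0;   // the most recently read bits
    std::size_t bitsSeen = 0;   // number of bits consumed so far
    for (unsigned char byte : data) {
        for (int bit = 7; bit >= 0; --bit) {
            window = (window << 1) | ((byte >> bit) & 1u);
            ++bitsSeen;
            if (bitsSeen >= patternLen && (window & mask) == (pattern & mask))
                offsets.push_back(bitsSeen - patternLen);
        }
    }
    return offsets;
}
```

For the pattern "0001" you would call it with pattern = 0x1 and patternLen = 4. Because the window is carried over from byte to byte, matches that straddle a byte boundary are found as well, and the file can be fed in chunk by chunk if the window and bit counter are kept between chunks.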

Cheers,

D
drkybelk at 2007-11-11 21:03:10 >