FAST SEARCHING IN A BYTEARRAY

For some reason there is no method to search for a specific text in a ByteArray. At least i could not find one. I tried a very straightforward search by simply scanning from every position. Although it works, it is very slow with bigger ByteArrays. A quick look on the internet resulted in an algorithm that suited my needs: the Boyer-Moore-Horspool algorithm. Unfortunately there was no ActionScript alternative available for it yet. So i decided to port it myself and with great result, because it greatly improved the performance of the searches for my use case.

package nl.vanhulzenonline.utils
{
	import flash.utils.ByteArray;

	public class ByteArrayUtils
	{
		public static function getIndexOf(text:String, data:ByteArray, start:int = 0, end:int = -1):int
		{
			var pattern:ByteArray = new ByteArray();
			pattern.writeUTFBytes(text);
			pattern.position = 0;

			if (end == -1)
				end = data.length - 1;

			var i:int;
			var badCharSkip:Array = new Array();

			// initialize the table to default value
			// when a character is encountered that does not occur
			// in the pattern, we can safely skip ahead for the whole
			// length of the pattern.
			for (i = 0; i <= 255; i++)
				badCharSkip[i] = pattern.length;

			// then populate it with the analysis of the pattern
			var endOfPattern:int = pattern.length - 1;
			for (i = 0; i < endOfPattern; i = i + 1)
				badCharSkip[pattern.readUnsignedByte()] = endOfPattern - i;

			// do the matching

			// search the data, while the pattern can still be within it.
			var dataPart:int;
			var endOfData:int = end;
			var dataPosition:int = start;
			while (endOfData >= endOfPattern)
			{
				// scan from the end of the pattern
				i = endOfPattern;
				while(true)
				{
					data.position = dataPosition + i;
					pattern.position = i;
					if (data.readUnsignedByte() == pattern.readUnsignedByte())
					{
						// if the first byte matches, we've found it.
						if (i == 0)
							return dataPosition;
						i--;
					}
					else
						break;
				}

				// otherwise, we need to skip some bytes and start again.
				// note that here we are getting the skip value based on
				// the last byte of pattern, no matter where we didn't
				// match. so if pattern is: "abcd" then we are skipping
				// based on 'd' and that value will be 4, and for "abcdd"
				// we again skip on 'd' but the value will be only 1.
				// the alternative of pretending that the mismatched
				// character was the last character is slower in the normal
 				// case (eg. finding "abcd" in "...azcd..." gives 4 by
				// using 'd' but only 4-2==2 using 'z'.
				data.position = dataPosition + endOfPattern;
				dataPart = data.readUnsignedByte();
				endOfData -= badCharSkip[dataPart];
				dataPosition += badCharSkip[dataPart];
			}
			return -1;
		}
	}
}

저작자표시

'공부방 > Flex' 카테고리의 다른 글

FLEX 소켓통신 테스트 해보기 (0)	2012.02.13
[Flex] Flex 에서 데이터의 사용 (0)	2012.02.13
TitleWindow를 이용한 팝업(Popup)에 대한 간단한 설명과 Tip (0)	2012.02.13
Panel 이나 TitleWindow의 TItleBar에 Button 붙이기 (0)	2012.02.13
Flex 유용한 링크 (0)	2012.02.13

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

일상 기록

FAST SEARCHING IN A BYTEARRAY

'공부방 > Flex' 카테고리의 다른 글

티스토리툴바

단축키

내 블로그

블로그 게시글

모든 영역

FAST SEARCHING IN A BYTEARRAY

'공부방 > Flex' 카테고리의 다른 글

'공부방/Flex' Related Articles

티스토리툴바

단축키

내 블로그

블로그 게시글

모든 영역