SHUNMIAN

1 Prefix String Search: Tries
2 Substring Search
3 正则表达式
4 总结
5 参考资料

1 Prefix String Search: Tries

Tries: 又称前缀树或字典树，是一种有序树，用于保存关联数组，其中的键通常是字符串。与二叉查找树不同，键不是直接保存在节点中，而是由节点在树中的位置决定。一个节点的所有子孙都有相同的前缀，也就是这个节点对应的字符串，而根节点对应空字符串。一般情况下，不是所有的节点都有对应的值，只有叶子节点和部分内部节点所对应的键才有相关的值。

1.1 R-way Tries

R-way Tries: R元树*是Tries的一种具体实现形式。每一个node：有1个node数组，指向R(Radix)个结点指针(指向下一个结点或者null)；以及value(有值或null)。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36 public class TrieST<Value>{
	
	private static final int R = 256;
	private Node root = new Node();
	private static class Node {
		private Object value;
		private Node[] next = new Node[R];
	}
	
	public void put(String key, Value val){ 
		root = put(root, key, val, 0); 
	}
	private Node put(Node x, String key, Value val, int d){
		if (x == null) x = new Node();
		if (d == key.length()) { x.value = val; return x; }
		char c = key.charAt(d);
		x.next[c] = put(x.next[c], key, val, d+1);
		return x;
	}
	
	public boolean contains(String key){ 
		return get(key) != null; 
	}

	 public Value get(String key){
		 Node x = get(root, key, 0);
		 if (x == null) return null;
		 return (Value) x.value;
	 }
	 private Node get(Node x, String key, int d){
		 if (x == null) return null;
		 if (d == key.length()) return x;
		 char c = key.charAt(d);
		 return get(x.next[c], key, d+1);
	 }
}

1.2 Ternary Search Tries

Ternary Search Tries: 三元搜索树是前缀树的另一种一种实现，树的各个节点之间的结构类似二叉搜索树。三元搜索数比R元树更节省空间(每个节点存储了一个字符、一个值对象或值指针以及三个指向子节点的指针。这三个字节点常被称为等位字节点、低位字节点和高位字节点。)，但是牺牲了部分查找速度。三元搜索树常用于实现拼写检查和自动完成功能。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40 public class TST<Value>{
	
	private Node root;
	private class Node{
		 private Value val;
		 private char c;
		 private Node left, mid, right;
	}
 
	public void put(String key, Value val){ 
		root = put(root, key, val, 0); 
	}
 	private Node put(Node x, String key, Value val, int d){
 		char c = key.charAt(d);
 		if (x == null) { x = new Node(); x.c = c; }
 		if (c < x.c) x.left = put(x.left, key, val, d);
 		else if (c > x.c) x.right = put(x.right, key, val, d);
 		else if (d < key.length() - 1) x.mid = put(x.mid, key, val, d+1);
 		else x.val = val;
 		return x;
 	}
 	
 	public boolean contains(String key){ 
 		return get(key) != null; 
 	}

 	public Value get(String key){
 		Node x = get(root, key, 0);
 		if (x == null) return null;
 		return x.val;
 	}
 	private Node get(Node x, String key, int d){
 		if (x == null) return null;
 		char c = key.charAt(d);
 		if (c < x.c) return get(x.left, key, d);
 		else if (c > x.c) return get(x.right, key, d);
 		else if (d < key.length() - 1) return get(x.mid, key, d+1);
 		else return x;
 	}
 }

1.3 Character-Based Operations

Ternary Search Trie vs Trie & Hash

1.4 Tries Summary

Ternary Search Trie vs Trie & Hash

2 Substring Search

2.1 Knuth-Morris-Pratt

2.2 Boyer-Moore

2.3 Rabin-Karp

3 正则表达式

3.1 Regular Expressions

3.2 REs and NFAs

3.3 NFA simulation

3.4 NFA construction

3.5 Application

4 总结

5 参考资料

-01 Algorithm (15)

Graph (9)

Twitter Facebook Google+

Shunmian

The only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one.