Regular-Expression Examples

Example Description
. Match any character except newline
[Rr]uby Match "Ruby" or "ruby"
rub[ye] Match "ruby" or "rube"
[aeiou] Match any one lowercase vowel
[0-9] Match any digit; same as [0123456789]
[a-z] Match any lowercase ASCII letter
[A-Z] Match any uppercase ASCII letter
[a-zA-Z0-9] Match any of the above
[^aeiou] Match anything other than a lowercase vowel
[^0-9] Match anything other than a digit
\d Match a digit: [0-9]
\D Match a nondigit: [^0-9]
\s Match a whitespace character: [ \t\r\n\f]
\S Match nonwhitespace: [^ \t\r\n\f]
\w Match a single word character: [A-Za-z0-9_]
\W Match a nonword character: [^A-Za-z0-9_]
ruby? Match "rub" or "ruby": the y is optional
ruby* Match "rub" plus 0 or more ys
ruby+ Match "rub" plus 1 or more ys
\d{3} Match exactly 3 digits
\d{3,} Match 3 or more digits
\d{3,5} Match 3, 4, or 5 digits
\D\d+ No group: + repeats \d
(\D\d)+ Grouped: + repeats \D\d pair
([Rr]uby(, )?)+ Match "Ruby", "Ruby, ruby, ruby", etc.
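A few rows of the table can be checked directly in the Scala REPL. The following is a minimal sketch, not part of the original notes: the names checks and digitRuns and the sample inputs are made up for illustration. It relies on String.matches, which succeeds only when the whole input matches the pattern, and on Regex.findAllIn, which returns every non-overlapping match.

```scala
// Each entry: (pattern, input, whether the WHOLE input should match).
// Triple-quoted strings keep backslashes literal, so \d needs no doubling.
val checks = Seq(
  ("[Rr]uby", "ruby", true),
  ("rub[ye]", "rube", true),
  ("[^aeiou]", "a", false),          // negated class rejects a vowel
  ("ruby?", "rub", true),            // the y is optional
  ("ruby+", "rub", false),           // + needs at least one y
  ("""\d{3,5}""", "1234", true),
  ("""(\D\d)+""", "a1b2", true),     // + repeats the whole \D\d pair
  ("([Rr]uby(, )?)+", "Ruby, ruby, ruby", true)
)

checks.foreach { case (pattern, input, expected) =>
  assert(input.matches(pattern) == expected, s"$pattern vs $input")
}

// findAllIn scans for matches anywhere in the input.
val digitRuns = """\d{3,}""".r.findAllIn("12 345 6789").toList
// digitRuns == List("345", "6789"): "12" is too short for {3,}
```

If any row behaved differently from the table, the corresponding assert would fail and report the pattern and input.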

http://www.cnblogs.com/createMoMo/archive/2013/05/24/3097519.html

https://github.com/deanwampler/spark-scala-tutorial

To debug or practice:

1. Start spark-shell, which opens a Scala prompt.

2. Run the following:

scala> val r = """(?<=\w)\b"""

r: String = (?<=\w)\b

scala> val items = "a.bold#empty.red".split(r)

items: Array[String] = Array(a, .bold, #empty, .red)
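The pattern (?<=\w)\b is zero-width: the lookbehind requires a word character just before the current position, and \b requires a word boundary there, so split cuts after each word without consuming any characters. As a rough sketch for comparison (the names byLookahead and byFind are illustrative, and both alternatives assume the only separators are "." and "#"), the same pieces can be produced with a lookahead or by matching the tokens instead of splitting:

```scala
val r = """(?<=\w)\b"""
val items = "a.bold#empty.red".split(r)

// Alternative 1: split just before each "." or "#" using a lookahead.
val byLookahead = "a.bold#empty.red".split("""(?=[.#])""")

// Alternative 2: match the tokens themselves, an optional prefix plus a word.
val byFind = """[.#]?\w+""".r.findAllIn("a.bold#empty.red").toList

assert(items.sameElements(Array("a", ".bold", "#empty", ".red")))
assert(byLookahead.sameElements(items))
assert(byFind == items.toList)
```

Arrays are compared with sameElements because == on Scala arrays tests reference equality.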
