A regular expression defines a search pattern for strings. The pattern can be anything from a single character or a fixed string to a complex expression containing special characters that describe the pattern. For a given string, the pattern may match once, several times, or not at all. Regular expressions can be used to search, edit and manipulate text. A regular expression is also known as a regex or regexp.
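As a rough illustration of the three uses named above (search, edit, manipulate), here is a minimal sketch. The class and method names are made up for this example; the calls themselves (Pattern.compile, Matcher.find, String.replaceAll, String.split) are standard JDK methods.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class RegexUsesSketch {
    // Search: is there at least one run of digits anywhere in the text?
    static boolean containsNumber(String text) {
        return Pattern.compile("\\d+").matcher(text).find();
    }

    // Edit: replace every digit with '#'.
    static String maskDigits(String text) {
        return text.replaceAll("\\d", "#");
    }

    // Manipulate: split on commas, ignoring whitespace around them.
    static String[] splitOnCommas(String text) {
        return text.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(containsNumber("order 66"));                // true
        System.out.println(maskDigits("card 1234"));                   // card ####
        System.out.println(Arrays.toString(splitOnCommas("a , b,c"))); // [a, b, c]
    }
}
```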
The java.util.regex package was added in Java SE 1.4. If you are running an older version of Java than that, you should really consider upgrading.
The Java regular expression classes live in the java.util.regex package, which contains three classes: Pattern, Matcher and PatternSyntaxException.
1) A Pattern object is the compiled version of the regular expression. It doesn't have any public constructor; we create one by passing the regular expression to its public static method compile.
2) A Matcher is the regex engine object that matches an input String against the Pattern object. This class doesn't have any public constructor either; we get a Matcher by calling the matcher method of a Pattern object, which takes the input String as its argument. We then call the matches method, which returns a boolean indicating whether the entire input String matches the regex pattern.
3) PatternSyntaxException is an unchecked exception thrown when the syntax of a regular expression is not correct.
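The Pattern and Matcher workflow described above can be sketched as follows. The class name, method names and sample inputs are illustrative assumptions. The example also uses Matcher.find, which is not mentioned in the list above, to contrast a whole-input match with a search for matching substrings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternMatcherSketch {
    // Compile once and reuse: Pattern instances are immutable.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches() succeeds only if the whole input matches the pattern.
    static boolean isAllDigits(String input) {
        Matcher matcher = DIGITS.matcher(input);
        return matcher.matches();
    }

    // find() scans the input for the next subsequence that matches.
    static List<String> extractDigitRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("123"));         // true
        System.out.println(isAllDigits("98HT12"));      // false
        System.out.println(extractDigitRuns("98HT12")); // [98, 12]
    }
}
```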
Java Regular Expression Metacharacters:-
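Regular expressions also have metacharacters, which act as short codes for common matching patterns.
^ indicates the beginning of a line.
$ indicates the end of a line.
A regular expression placed between ^ and $ must match the whole line rather than just a part of it. By default Java treats the entire input as a single line; the Pattern.MULTILINE flag makes ^ and $ match at every line break as well.
The following metacharacters can be used in regular expressions.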
\d - Any digit, short for [0-9].
\D - A non-digit, short for [^0-9].
\s - A whitespace character, short for [ \t\n\x0B\f\r].
\S - A non-whitespace character, short for [^\s].
\w - A word character, short for [a-zA-Z_0-9].
\W - A non-word character, short for [^\w].
\b - Matches a word boundary, where a word character is [a-zA-Z0-9_].
[..] - Matches any single character in brackets.
[^..] - Matches any single character not in brackets.
\t - Matches a tab (U+0009).
\v - Matches a vertical whitespace character, [\n\x0B\f\r\x85\u2028\u2029], in Java 8 and later; this includes the vertical tab (U+000B).
+ - Matches the preceding element 1 or more times. Equivalent to {1,}.
* - Matches the preceding element 0 or more times. Equivalent to {0,}.
? - Matches the preceding element 0 or 1 time. Equivalent to {0,1}.
. - (The period) matches any single character except a line terminator, unless the Pattern.DOTALL flag is set.
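A few of the metacharacters above can be tried out with a small sketch like the one below. The helper names found and wholeMatch and the sample strings are assumptions made for illustration; Pattern.matches and Matcher.find are standard JDK methods.

```java
import java.util.regex.Pattern;

class MetacharacterSketch {
    // True if the regex matches somewhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // True only if the regex matches the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(found("\\d", "room 42"));             // true: \d finds a digit
        System.out.println(found("\\bcat\\b", "a cat sat"));     // true: "cat" is a whole word
        System.out.println(found("\\bcat\\b", "concatenate"));   // false: no word boundary around "cat"
        System.out.println(wholeMatch("colou?r", "color"));      // true: ? makes the u optional
        System.out.println(wholeMatch("colou?r", "colour"));     // true
        System.out.println(wholeMatch("[^0-9]+", "abc"));        // true: one or more non-digits
        System.out.println(wholeMatch("a.c", "a\nc"));           // false: . skips line terminators
        System.out.println(found("\\d+", "12 apples"));          // true: digits occur in the input
        System.out.println(found("^\\d+$", "12 apples"));        // false: anchors demand digits only
    }
}
```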
Examples:-
The following are valid regular expressions, written as Java string literals (hence the doubled backslashes):
^\\d+$ - Numerics.
^\\w+$ - Alphanumerics (letters, digits and underscore).
^[a-z0-9_-]{3,16}$ - Lowercase text with digits, underscores or hyphens, between 3 and 16 characters long.
^\\d{5}$ - 5-digit numerics.
^(\\d{1,2})-(\\d{1,2})-(\\d{4})$ - Date format dd-MM-yyyy (checks the shape only, not whether the day and month are in range).
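To show how such patterns might be used, here is a minimal sketch built on the lowercase-text, five-digit and date patterns above. The class, constant and method names are assumptions made for this example. The parentheses in the date pattern are capturing groups, which the sketch reads back with Matcher.group.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExamplePatternsSketch {
    private static final Pattern USERNAME = Pattern.compile("^[a-z0-9_-]{3,16}$");
    private static final Pattern FIVE_DIGITS = Pattern.compile("^\\d{5}$");
    private static final Pattern DATE = Pattern.compile("^(\\d{1,2})-(\\d{1,2})-(\\d{4})$");

    static boolean isUsername(String s) {
        return USERNAME.matcher(s).matches();
    }

    static boolean isFiveDigits(String s) {
        return FIVE_DIGITS.matcher(s).matches();
    }

    // Returns {day, month, year}, or null if the input does not have the dd-MM-yyyy shape.
    // Only the shape is checked: 99-99-2020 is accepted by the pattern.
    static int[] parseDate(String s) {
        Matcher m = DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        System.out.println(isUsername("my-us3r_n4m3"));                 // true
        System.out.println(isUsername("ab"));                           // false: too short
        System.out.println(isFiveDigits("12345"));                      // true
        System.out.println(Arrays.toString(parseDate("25-12-2020")));   // [25, 12, 2020]
        System.out.println(parseDate("2020-12-25") == null);            // true: wrong shape
    }
}
```

A real date validator would still need a range check (or a date parser) after the shape check.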
Regular Expression Example:-
Regular expressions make it possible to find all instances of text that match a certain pattern, or to get a boolean result saying whether the pattern was found. This can be used to validate input such as phone numbers, social security numbers, email addresses, web form input data, decimals, numerics and alphanumerics, to scrub data, and much more. For example, if a string matches the pattern for numerics, then the string is numeric. The following program prints only the numeric strings from a list.
import java.util.ArrayList;
import java.util.List;

public class ValidateDemo {
    public static void main(String[] args) {
        List<String> input = new ArrayList<String>();
        input.add("123");
        input.add("98HT12");
        input.add("345");
        for (String numeric : input) {
            // String.matches tests the whole string against the pattern
            if (numeric.matches("^\\d+$")) {
                System.out.println("Numerics : " + numeric);
            }
        }
    }
}
Output:-
Numerics : 123
Numerics : 345
Syntax Error Validation in Java Using Pattern:-
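import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegularExpValidation {
    public static void main(String[] args) {
        // The quantifier {1, is never closed, so this pattern is invalid
        String error = validateRegExp("^\\w{1,$");
        if (error != null) {
            System.out.println(error);
        }
    }

    // Returns null if the pattern compiles, otherwise a description of the syntax error
    public static String validateRegExp(String regPattern) {
        String errorMessage = null;
        try {
            Pattern.compile(regPattern);
        } catch (PatternSyntaxException exception) {
            errorMessage = exception.getDescription();
        }
        return errorMessage;
    }
}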
Output:-
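Unclosed counted closure
(The description text comes from the JDK implementation, so the exact wording may vary between versions. The message "Illegal repetition" is reported instead when { is not followed by a digit.)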
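Besides getDescription, PatternSyntaxException also exposes getIndex (the approximate position of the error, or -1 if unknown) and getPattern (the offending regular expression). A minimal sketch, assuming a hypothetical helper named describeError, shows how these could be combined into one report:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDetailsSketch {
    // Returns null for a valid regex, otherwise a one-line report
    // built from the exception's accessor methods.
    static String describeError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                    + " in " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeError("\\d+"));       // null: the pattern is valid
        System.out.println(describeError("^\\w{1,$"));   // report for the unclosed quantifier
        System.out.println(describeError("[a-z"));       // report for the unclosed character class
    }
}
```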
Backslashes in Java with Regular Expressions:-
In literal Java strings the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. Written as a Java string, this regular expression becomes "\\\\". That's right: 4 backslashes to match a single one.
The regex \w matches a word character. As a Java string, this is written as "\\w".
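The double escaping described above can be checked with a short sketch. The class and method names are assumptions for illustration. The last helper goes slightly beyond the text: it uses Pattern.quote, a standard JDK method that escapes a whole literal string so that none of its characters are treated as metacharacters.

```java
import java.util.regex.Pattern;

class BackslashSketch {
    // The Java literal "\\\\" is the two-character regex \\ , which matches one backslash.
    static boolean containsBackslash(String input) {
        return Pattern.compile("\\\\").matcher(input).find();
    }

    // The Java literal "\\w" is the regex \w , which matches one word character.
    static boolean isSingleWordChar(String input) {
        return input.matches("\\w");
    }

    // Pattern.quote turns the literal into a regex that matches it character for character.
    static boolean containsLiteral(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
    }

    public static void main(String[] args) {
        String path = "C:\\temp";                                 // the string C:\temp
        System.out.println(path.length());                        // 7
        System.out.println(containsBackslash(path));              // true
        System.out.println(containsBackslash("C:/temp"));         // false
        System.out.println(isSingleWordChar("a"));                // true
        System.out.println(containsLiteral("1+1", "1+1=2"));      // true: + is taken literally
        System.out.println(Pattern.compile("1+1").matcher("1+1=2").find()); // false: + is a quantifier
    }
}
```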