Applying Regular Expressions in Java (2)

  JDK 1.4 finally ships with its own regular expression API package, so Java programmers no longer have to go to the trouble of finding a third-party regular expression library. Let's take a look right away at this belated gift from Sun (for me, at least, that is exactly what it is).
  1. Introduction:
  java.util.regex is a class library package for matching strings against patterns specified as regular expressions.

  It consists of two classes: Pattern and Matcher.
  Pattern: a Pattern is the compiled representation of a regular expression.
  Matcher: a Matcher object is a state machine that performs match operations on a string according to a Pattern.

  First, a Pattern instance compiles a regular expression whose syntax is similar to Perl's; then a Matcher instance matches strings under the control of that Pattern.

  Let's look at these two classes in turn:

  2. The Pattern class:
  The methods of Pattern are as follows:
  static Pattern compile(String regex)
  Compiles the given regular expression into a Pattern.
  static Pattern compile(String regex, int flags)
  As above, but with an additional flags parameter; the available flags include CASE_INSENSITIVE, MULTILINE, DOTALL, UNICODE_CASE, and CANON_EQ.
  int flags()
  Returns the match flags of this Pattern.
  Matcher matcher(CharSequence input)
  Creates a Matcher that will match the given input against this Pattern.
  static boolean matches(String regex, CharSequence input)
  Compiles the given regular expression and matches the entire input against it. This method suits cases where the regular expression is used only once, for a single match, because no Matcher instance then has to be created explicitly.
  String pattern()
  Returns the regular expression from which this Pattern was compiled.
  String[] split(CharSequence input)
  Splits the target string around matches of this Pattern.
  String[] split(CharSequence input, int limit)
  As above, with an additional limit parameter that specifies the number of segments; for example, with limit set to 2 the target string is split into two segments.
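
  The methods listed above can be tried out with a minimal sketch like the following. The class name PatternBasics, its helper method, and the sample strings are illustrative assumptions, not part of the original article.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample strings are illustrative only.
class PatternBasics {
    // Compiled once with a flag so that matching ignores letter case.
    static final Pattern NAME = Pattern.compile("kevin", Pattern.CASE_INSENSITIVE);

    // One-off whole-string match; the caller never creates a Matcher.
    static boolean isWordPlusDigit(String input) {
        return Pattern.matches("[a-z]+\\d", input);
    }

    public static void main(String[] args) {
        // pattern() returns the source regex, flags() the flag bits.
        System.out.println(NAME.pattern());
        System.out.println((NAME.flags() & Pattern.CASE_INSENSITIVE) != 0);
        System.out.println(NAME.matcher("KEVIN").matches());
        System.out.println(isWordPlusDigit("kevin7"));
        System.out.println(isWordPlusDigit("kevin77"));
    }
}
```

  Here "kevin77" is rejected because Pattern.matches() requires the whole input to match, and the expression allows only one trailing digit.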

  A regular expression is a string of characters with special meanings. To use one, first create an instance of the Pattern class by compiling the expression, then call the Pattern object's matcher() method to create a Matcher instance; the Matcher instance can then match the target string against the compiled regular expression. Several Matchers can share one Pattern object.
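
  As a rough illustration of several Matchers sharing one Pattern, the sketch below compiles an expression once and creates a fresh Matcher per input. The class SharedPattern, the method containsDigits, and the inputs are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern and inputs are illustrative only.
class SharedPattern {
    // Compiled once; the same Pattern backs every Matcher created below.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Each call creates its own Matcher, which holds the per-input match state.
    static boolean containsDigits(CharSequence input) {
        Matcher m = DIGITS.matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(containsDigits("JDK 1.4"));      // true
        System.out.println(containsDigits("Jakarta-ORO"));  // false
    }
}
```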

  Let's now walk through a simple example to see how it creates a Pattern object, compiles a regular expression, and finally splits the target string according to that expression:
  import java.util.regex.*;

  public class Replacement {
      public static void main(String[] args) throws Exception {
          // Create a Pattern, compiling the regular expression at the same time
          Pattern p = Pattern.compile("[/]+");
          // Use Pattern's split() method to split the string on "/"
          String[] result = p.split(
              "Kevin has seen \"LEON\" several times, because it is a good film."
              + "/Kevin has already seen \"The killer is not too cold\" several times, because it is a "
              + "good film./Terms: Kevin.");
          for (int i = 0; i < result.length; i++)
              System.out.println(result[i]);
      }
  }

  Output:

  Kevin has seen "LEON" several times, because it is a good film.
  Kevin has already seen "The killer is not too cold" several times, because it is a good film.
  Terms: Kevin.

  As you can see, the program splits the string on "/". Next we use the split(CharSequence input, int limit) method to specify the number of segments, changing the program to:
  String[] result = p.split("Kevin has seen \"LEON\" several times, because it is a good film./Kevin has already seen \"The killer is not too cold\" several times, because it is a good film./Terms: Kevin.", 2);

  The argument 2 here indicates that the target string is split into two segments.

  The output then becomes:

  Kevin has seen "LEON" several times, because it is a good film.
  Kevin has already seen "The killer is not too cold" several times, because it is a good film./Terms: Kevin.
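
  To see how the limit argument changes the result, here is a small sketch using a shorter input. The class SplitLimits and the string "a/b//c/" are assumptions made for illustration; the behavior for zero and negative limits follows the java.util.regex.Pattern documentation rather than the article itself.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the input string is illustrative only.
class SplitLimits {
    static final Pattern SLASHES = Pattern.compile("[/]+");

    static String[] split(String input, int limit) {
        return SLASHES.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a/b//c/";
        // limit 2: at most two segments, the rest stays in the last one
        System.out.println(Arrays.toString(split(input, 2)));   // [a, b//c/]
        // limit 0: split everywhere, trailing empty strings dropped
        System.out.println(Arrays.toString(split(input, 0)));   // [a, b, c]
        // negative limit: split everywhere, trailing empty strings kept
        System.out.println(Arrays.toString(split(input, -1)));  // [a, b, c, ]
    }
}
```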

  Looking back at the Replacement program, we can compare how the java.util.regex package constructs a Pattern object and compiles a regular expression with how Jakarta-ORO, introduced in the previous article, accomplishes the same task. Jakarta-ORO first constructs a PatternCompiler object and then calls its compile() method to compile the regular expression into a Pattern object:

  PatternCompiler orocom = new Perl5Compiler();

  Pattern pattern = orocom.compile("REGULAR EXPRESSIONS");

  PatternMatcher matcher = new Perl5Matcher();

  In the java.util.regex package, however, we only need the Pattern class, and calling its compile() method directly achieves the same result:
  Pattern p = Pattern.compile("[/]+");

  So the java.util.regex construction approach appears more concise and easier to understand than Jakarta-ORO's.

  3. The Matcher class:
  The methods of Matcher are as follows:
  Matcher appendReplacement(StringBuffer sb, String replacement)
  Replaces the currently matched substring with the specified string, and appends to the StringBuffer the text between the previous match and the current match, followed by the replacement.
  StringBuffer appendTail(StringBuffer sb)
  Appends to the StringBuffer the text remaining after the last match.
  int end()
  Returns the index in the target string just after the last character of the current match.
  int end(int group)
  Returns the index just after the last character of the substring matched by the specified group.
  boolean find()
  Attempts to find the next matching substring in the target string.
  boolean find(int start)
  Resets the Matcher and attempts to find the next matching substring in the target string, starting at the specified index.
  String group()
  Returns the substring matched by the current match.
  String group(int group)
  Returns the substring captured by the specified group in the current match.
  int groupCount()
  Returns the number of capturing groups in this Matcher's pattern.
  boolean lookingAt()
  Checks whether the target string begins with a match.
  boolean matches()
  Attempts to match the entire target string; returns true only when the whole string matches.
  Pattern pattern()
  Returns the Pattern object that this Matcher uses.
  String replaceAll(String replacement)
  Replaces every substring of the target string that matches the pattern with the specified string.
  String replaceFirst(String replacement)
  Replaces the first substring of the target string that matches the pattern with the specified string.
  Matcher reset()
  Resets this Matcher.
  Matcher reset(CharSequence input)
  Resets this Matcher and gives it a new target string.
  int start()
  Returns the index in the target string of the first character of the current match.
  int start(int group)
  Returns the index in the target string of the first character of the substring matched by the specified group.

  (Are the method descriptions alone hard to follow? Don't worry; they become much easier to understand with examples.)

  A Matcher instance matches a target string against a given pattern (a regular expression compiled by Pattern). All input to a Matcher is supplied through the CharSequence interface, so that matching can be applied to data from a wide variety of sources.
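
  Because the input is only required to be a CharSequence, the same matching code can be handed a String, a StringBuffer, or a CharBuffer. The following is a minimal sketch of that idea; the class CharSequenceSources and the method countMatches are hypothetical names.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern and inputs are illustrative only.
class CharSequenceSources {
    static final Pattern CAT = Pattern.compile("cat");

    // Counts matches in any CharSequence, whatever its concrete type.
    static int countMatches(CharSequence input) {
        Matcher m = CAT.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("one cat"));                        // 1
        System.out.println(countMatches(new StringBuffer("cat and cat")));  // 2
        System.out.println(countMatches(CharBuffer.wrap("catcatcat".toCharArray()))); // 3
    }
}
```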

  Let's look at how these methods are used:

  ★ matches() / lookingAt() / find():
  A Matcher object is created by calling the matcher() method of a Pattern object. Once created, it can perform three different kinds of match operation:

  The matches() method attempts to match the entire target string; it returns true only when the whole string matches.
  The lookingAt() method checks whether the target string begins with a match.
  The find() method attempts to find the next matching substring in the target string.

  All three methods return a boolean indicating success or failure.
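
  The difference between the three operations is easiest to see side by side. The sketch below runs all three against the same inputs; the class MatchModes, the method tryAll, and the sample words are assumptions for illustration. Note the reset() call before find(), so that find() scans from the start of the input rather than continuing after an earlier successful match.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern and inputs are illustrative only.
class MatchModes {
    static final Pattern CAT = Pattern.compile("cat");

    // Returns {matches(), lookingAt(), find()} for one input.
    static boolean[] tryAll(CharSequence input) {
        Matcher m = CAT.matcher(input);
        boolean whole = m.matches();     // the entire input must match
        boolean prefix = m.lookingAt();  // the input must start with a match
        m.reset();                       // make find() scan from the beginning
        boolean anywhere = m.find();     // a match anywhere is enough
        return new boolean[] {whole, prefix, anywhere};
    }

    public static void main(String[] args) {
        String[] inputs = {"cat", "catalog", "tomcat", "dog"};
        for (String s : inputs) {
            System.out.println(s + " -> " + Arrays.toString(tryAll(s)));
        }
    }
}
```

  For these inputs, "cat" succeeds in all three modes, "catalog" fails only matches(), "tomcat" succeeds only with find(), and "dog" fails all three.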

  ★ replaceAll() / appendReplacement() / appendTail():
  The Matcher class also provides four methods for replacing matched substrings with a specified string:

  replaceAll()
  replaceFirst()
  appendReplacement()
  appendTail()

  replaceAll() and replaceFirst() are fairly simple to use; see the method descriptions above. Here we focus on the appendReplacement() and appendTail() methods.
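
  For completeness, the two simpler methods can be sketched as follows. The class ReplaceDemo, its method names, and the sample sentence are illustrative assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern and input are illustrative only.
class ReplaceDemo {
    static final Pattern CAT = Pattern.compile("cat");

    // Replaces every match of the pattern.
    static String replaceEvery(String input) {
        return CAT.matcher(input).replaceAll("dog");
    }

    // Replaces only the first match of the pattern.
    static String replaceOnlyFirst(String input) {
        return CAT.matcher(input).replaceFirst("dog");
    }

    public static void main(String[] args) {
        String input = "one cat, two cats";
        System.out.println(replaceEvery(input));      // one dog, two dogs
        System.out.println(replaceOnlyFirst(input));  // one dog, two cats
    }
}
```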

  appendReplacement(StringBuffer sb, String replacement) replaces the currently matched substring with the specified string and appends to the StringBuffer the text between the previous match and the current one, followed by the replacement. appendTail(StringBuffer sb) appends to the StringBuffer the text remaining after the last match.

  For example, take the string fatcatfatcatfat and the regular expression "cat". After the first match, calling appendReplacement(sb, "dog") leaves sb containing fatdog: the cat in fatcat is replaced by dog and appended to sb together with the text that precedes the match. After the second match, calling appendReplacement(sb, "dog") again makes sb contain fatdogfatdog. If appendTail(sb) is called at the end, the final content of sb is fatdogfatdogfat.

  Still a little fuzzy? Then let's look at a simple program:
  // This example changes every "Kelvin" in the sentence to "Kevin"
  import java.util.regex.*;

  public class MatcherTest {
      public static void main(String[] args) throws Exception {
          // Create a Pattern object, compiling the simple regular expression "Kelvin"
          Pattern p = Pattern.compile("Kelvin");
          // Use Pattern's matcher() method to create a Matcher object
          Matcher m = p.matcher("Kelvin Li and Kelvin Chan are both working in Kelvin Chen's KelvinSoftShop company");
          StringBuffer sb = new StringBuffer();
          int i = 0;
          // Use find() to locate the first match
          boolean result = m.find();
          // Loop over every "Kelvin" in the sentence, replacing it and appending the result to sb
          while (result) {
              i++;
              m.appendReplacement(sb, "Kevin");
              System.out.println("Contents of sb after match " + i + ": " + sb);
              // Look for the next match
              result = m.find();
          }
          // Finally, call appendTail() to append the text remaining after the last match to sb
          m.appendTail(sb);
          System.out.println("Final contents of sb after calling m.appendTail(sb): " + sb.toString());
      }
  }

  Final output:
  Contents of sb after match 1: Kevin
  Contents of sb after match 2: Kevin Li and Kevin
  Contents of sb after match 3: Kevin Li and Kevin Chan are both working in Kevin
  Contents of sb after match 4: Kevin Li and Kevin Chan are both working in Kevin Chen's Kevin
  Final contents of sb after calling m.appendTail(sb): Kevin Li and Kevin Chan are both working in Kevin Chen's KevinSoftShop company

  After reading the program above, is the use of appendReplacement() and appendTail() clearer? If you are still not quite sure, it is best to write a few lines of code yourself and test them.

  ★ group() / group(int group) / groupCount():
  These methods are similar to the MatchResult.group() method of Jakarta-ORO introduced in the previous article (please refer to that article for the Jakarta-ORO material): they return the substrings captured by the groups of a match. The following code illustrates their use:
  import java.util.regex.*;

  public class GroupTest {
      public static void main(String[] args) throws Exception {
          Pattern p = Pattern.compile("(ca)(t)");
          Matcher m = p.matcher("one cat, two cats in the yard");
          StringBuffer sb = new StringBuffer();
          boolean result = m.find();
          System.out.println("Number of groups in this match: " + m.groupCount());
          for (int i = 1; i <= m.groupCount(); i++) {
              System.out.println("Substring matched by group " + i + ": " + m.group(i));
          }
      }
  }

  Output:
  Number of groups in this match: 2
  Substring matched by group 1: ca
  Substring matched by group 2: t

  The remaining methods of the Matcher object are easier to understand, and since space is limited, readers are invited to verify them by writing their own programs.
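
  As a starting point for that kind of experiment, here is a small sketch of start(), end(), start(int group), and reset(CharSequence input). The class MatchPositions and its two helper methods are hypothetical; the pattern and sentence are reused from the GroupTest program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; helper names are illustrative only.
class MatchPositions {
    static final Pattern CAT = Pattern.compile("(ca)(t)");

    // Describes every match as "text [start,end) group 2 at index".
    static List<String> describe(CharSequence input) {
        Matcher m = CAT.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group() + " [" + m.start() + "," + m.end() + ") group 2 at " + m.start(2));
        }
        return out;
    }

    // Reuses one Matcher for a second input via reset(CharSequence).
    static int firstStartAfterReset(CharSequence first, CharSequence second) {
        Matcher m = CAT.matcher(first);
        m.find();          // result ignored; only exercises the first input
        m.reset(second);   // discard state and switch to the new input
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        for (String line : describe("one cat, two cats in the yard")) {
            System.out.println(line);
        }
        System.out.println(firstStartAfterReset("dogs only", "a cat"));
    }
}
```

  Note that end() reports the index just past the match, so "cat" starting at index 4 ends at index 7.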

  4. A small program for checking email addresses:
  Finally, let's look at a program that checks an email address for illegal characters. It is not a complete email address validator and cannot test every possible case, but you can add the features you need on top of it.
  import java.util.regex.*;

  public class Email {
      public static void main(String[] args) throws Exception {
          String input = args[0];
          // Check whether the address starts with the illegal character "." or "@"
          Pattern p = Pattern.compile("^\\.|^\\@");
          Matcher m = p.matcher(input);
          if (m.find()) {
              System.out.println("Email addresses cannot start with '.' or '@'");
          }
          // Check whether the address starts with "www."
          p = Pattern.compile("^www\\.");
          m = p.matcher(input);
          if (m.find()) {
              System.out.println("Email addresses cannot start with 'www.'");
          }
          // Check whether the address contains illegal characters
          p = Pattern.compile("[^A-Za-z0-9\\.\\@\\-~#]+");
          m = p.matcher(input);
          StringBuffer sb = new StringBuffer();
          boolean result = m.find();
          boolean deletedIllegalChars = false;
          while (result) {
              // An illegal character was found, so set the flag
              deletedIllegalChars = true;
              // Remove illegal characters such as colons or double quotes and append the rest to sb
              m.appendReplacement(sb, "");
              result = m.find();
          }
          m.appendTail(sb);
          input = sb.toString();
          if (deletedIllegalChars) {
              System.out.println("The email address you entered contains illegal characters such as colons or commas; please correct it");
              System.out.println("Your current input is: " + args[0]);
              System.out.println("After correction, a legal address would look like: " + input);
          }
      }
  }

  For example, if we type the following on the command line: java Email www.kevin@163.net

  the output will be: Email addresses cannot start with 'www.'

  If the input is @kevin@163.net

  the output is: Email addresses cannot start with '.' or '@'

  When the input is: cgjmail#$%@163.net

  the output is:

  The email address you entered contains illegal characters such as colons or commas; please correct it
  Your current input is: cgjmail#$%@163.net
  After correction, a legal address would look like: cgjmail#@163.net
  (The "#" survives because the character class in the program treats it as a legal character.)
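
  The cleaning loop in the program removes each run of illegal characters by appending an empty replacement, which has the same effect as a single replaceAll("") call. The following sketch shows that shorter form; the class EmailCleaner, the method stripIllegal, and the sample address are assumptions for illustration, and the character class is the one used in the program above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original program.
class EmailCleaner {
    // Anything outside this set counts as an illegal character.
    static final Pattern ILLEGAL = Pattern.compile("[^A-Za-z0-9\\.\\@\\-~#]+");

    // Removes every run of illegal characters from the address.
    static String stripIllegal(String address) {
        return ILLEGAL.matcher(address).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripIllegal("kevin:chen,@163.net"));  // kevinchen@163.net
    }
}
```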

  5. Summary:
  This article introduced the classes and methods of java.util.regex, the regular expression library in jdk1.4.0-beta3. Readers who compare it with the Jakarta-ORO API introduced in the previous article should find the API easier to master. The library's capabilities will of course keep growing; for the latest information, readers are advised to check Sun's website from time to time.

  6. Closing remarks:
  I had originally planned to write one more article introducing some of the more representative commercial regular expression libraries, but since a free and good regular expression library is now available, why pay for one? I expect many readers feel the same way. Readers who are interested in third-party regular expression libraries can search online themselves or visit the website listed in the references.

  References

  The java.util.regex API documentation
  Dana Nourie and Mike McCloskey, "Regular Expressions and the Java™ Programming Language"
  For more third-party regular expression resources and applications developed with them, see http://www.meurrens.org/ip-Links/java/regex/index.html

  About the author: Kevin Chen holds a Bachelor of Engineering in Electronic Information Engineering from Shantou University and works in the Zhuhai development department of Taiwan's Dah Sing Press, where he currently uses Java to develop electronic dictionaries and related projects based on Chinese, Japanese, and Korean electronic data. He can be reached by e-mail at cgjmail@163.net.
