Extracting fixed-length numbers from a text file

I have a text file that contains both characters and numbers, as follows:

ANKR00TUR_R_20183240000_01D_30S_MO.rnx:  2018    11    20    00    00    0.0000000     GPS         TIME OF FIRST OBS
brmu3350.14o: 2014 12 1 0 0 0.0000000 GPS TIME OF FIRST OBS
KNY12040.14o: 2014 7 23 0 0 0.0000000 GPS TIME OF FIRST OBS
rinex_quantity:grep "TIME OF FIRST OBS" * > time_of_first_epochs

I need to extract only the 4-digit numbers and store them in another file, like this:

2018
2014
2014

I tried the following command, but it extracts every run of 4 digits, including those inside longer numbers:

grep -Po "\d{4}" data

2018
3240
2018
0000
3350
2014
0000
1204
2014
0000

text-processing






asked Jan 14 at 10:14 by deepblue_86

Do you need to extract the number after the colon?

– AtomiX84, Jan 14 at 10:17

2 Answers
Your grep command was almost correct; you just have to anchor the pattern so that it matches only when there is a word boundary both before and after it.

Word boundaries are zero-length patterns that match between a word character (letters, digits, underscore) and a non-word character (e.g. spaces or punctuation), or at the start or end of a line.

In grep, you can do this either by surrounding your pattern with \b, or by using the -w switch to enable word matching:

$ grep -Po '\b\d{4}\b' data
2018
2014
2014

$ grep -Pow '\d{4}' data
2018
2014
2014
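
To see the boundary rule at work, and to write the result to a file as the question asks, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P option). The output names years.txt and years_after_colon.txt are made up for illustration. The second command is a stricter variant that is not part of the answer above: it only accepts a 4-digit number that directly follows the colon.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)" || exit 1

# Three sample lines copied from the question, saved as "data".
cat > data <<'EOF'
ANKR00TUR_R_20183240000_01D_30S_MO.rnx:  2018    11    20    00    00    0.0000000     GPS         TIME OF FIRST OBS
brmu3350.14o: 2014 12 1 0 0 0.0000000 GPS TIME OF FIRST OBS
KNY12040.14o: 2014 7 23 0 0 0.0000000 GPS TIME OF FIRST OBS
EOF

# Letters, digits and underscores are all word characters, so there is no
# boundary inside "_20183240000_" or "brmu3350"; only the stand-alone
# 4-digit field on each line is matched.
grep -Po '\b\d{4}\b' data > years.txt

# Assumed stricter variant: require the colon first. \K (PCRE) drops the
# text matched so far, so only the four digits are printed.
grep -Po ':\s*\K\d{4}\b' data > years_after_colon.txt

cat years.txt
```

For this sample, both files should end up with 2018, 2014 and 2014, one per line. The colon-anchored form may be safer if a file name could ever contain a stand-alone 4-digit group, for example between two dots.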






answered Jan 14 at 10:45 by Byte Commander

With Miller (http://johnkerl.org/miller/doc), the command is:

mlr --implicit-csv-header --pprint cut -f 2 then label year input

This produces the following output:

year
2014
2014

My input file contains:

brmu3350.14o:  2014    12     1     0     0    0.0000000     GPS         TIME OF FIRST OBS
KNY12040.14o: 2014 7 23 0 0 0.0000000 GPS TIME OF FIRST OBS

I simply extracted the second column with cut.
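
The same second-column idea can also be sketched with awk, which is usually preinstalled. This swaps in a different tool rather than showing Miller itself. The file name input follows the answer, years.txt is a made-up output name, and the extra test on the field is an added assumption so that lines whose second field is not exactly four digits (such as the rinex_quantity line in the question) are skipped.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)" || exit 1

# The two lines from this answer plus the odd fourth line from the question.
cat > input <<'EOF'
brmu3350.14o:  2014    12     1     0     0    0.0000000     GPS         TIME OF FIRST OBS
KNY12040.14o: 2014 7 23 0 0 0.0000000 GPS TIME OF FIRST OBS
rinex_quantity:grep "TIME OF FIRST OBS" * > time_of_first_epochs
EOF

# awk splits each line on runs of blanks, so $2 is the field after "name:".
# Four explicit [0-9] classes are used instead of {4}, which some older
# awk builds do not support.
awk '$2 ~ /^[0-9][0-9][0-9][0-9]$/ { print $2 }' input > years.txt

cat years.txt
```

Unlike the Miller command, this prints no header line, so years.txt holds only the numbers.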







answered Jan 14 at 10:47 by aborruso