• Batch Script – Extract Text Line From Text Files


    #504699

    Folks,

    Hope all are good. :)

    Can a batch expert help me with this batch file?

    I simply wanted to extract some text from text files

    Begin sampletext End >> file1.txt

    Begin anotherlineoftext End >> file2.txt

    Output these text strings to a new text file

    I wanted to run it on my folder of text files.

    I found this, but I'm still confused:

    http://stackoverflow.com/questions/18069790/batch-script-to-extract-lines-between-two-specified-lines

    Code:
    @ECHO OFF
    SETLOCAL 
    
    
    SET "sourcedir=c:sourcedir"    '  current folder
    
    
    SET "destdir=c:destdir"      
    
    
    for /f "tokens=1 delims=[]" %%a in ('find /n "'Parent|Child"^<"%sourcedir%Cost_Center.txt" ') do set /a start=%%a
    
    
    for /f "tokens=1 delims=[]" %%a in ('find /n "!PropertyArray= Begin
    
    Center"^<"%sourcedir%    folder files   " ') do set /a end=%%a
    (
    for /f "tokens=1* delims=[]" %%a in ('find /n /v ""^"%destdir%newfile.txt"
    GOTO :EOF
    
    

    I don’t know how to fix this, it seems really complicated now.

    I have very basic experience of cmd and batch files, so any advice is really appreciated.

    many thanks for your advice 😀

    pb

    • #1553994

      PB,

      This is the kind of thing that PowerShell handles very well and very easily. If you could provide some sample data and a template for exactly what you are trying to extract I’d be glad to work up a PS script to accomplish it.

      HTH :cheers:

      May the Forces of good computing be with you!

      RG

      PowerShell & VBA Rule!

    • #1554006

      pb89,

      Your problem setup isn’t clear. Your statements:

      I simply wanted to extract some text from text files

      and

      Output these text strings to a new text file

      These make sense as they stand. However between them you muddied the waters with:

      Begin sampletext End >> file1.txt

      Begin anotherlineoftext End >> file2.txt

      Uh, what? You see, I can read this as a pseudocode example of the problem setup, or I can read it as actual code, because this is (largely) correct command syntax. In the latter case, the code doesn’t reflect the other statements you made.

      However, let’s ignore your pseudocode/real code statements and just go with the other statements. Why not simply code the following command:

      copy Combined.out + *.txt

      This takes all the text files and appends them together into a single output file called “Combined.out”. I changed the file extension on the new file because I don’t want recursive file copies, locking problems and any other bad things that might happen if the input and output file extensions were the same.
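For readers who would rather stay in PowerShell, the same "append everything into one file" idea can be sketched roughly as below. This is only an illustration under stated assumptions: the temp folder and the two sample files are made up so that the snippet runs on its own, and it is not meant as an exact equivalent of the `copy` command above.

```powershell
# Hypothetical scratch folder and sample files, created only so the sketch is self-contained
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("Combine_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value 'Begin sampletext End'
Set-Content -Path (Join-Path $dir 'file2.txt') -Value 'Begin anotherlineoftext End'

# The output keeps a different extension so it is never picked up as an input file
$combined = Join-Path $dir 'Combined.out'

# Read every .txt file in name order and write all lines to the single output file
Get-ChildItem -Path (Join-Path $dir '*.txt') |
    Sort-Object Name |
    Get-Content |
    Set-Content -Path $combined

$result = @(Get-Content -Path $combined)
$result

# Clean up the scratch folder
Remove-Item -Path $dir -Recurse -Force
```

As in the post above, giving the output file a different extension keeps it out of the `*.txt` input set.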

    • #1554008

      Folks,

      my apologies, yes, this is a complex problem and my brain has fried. :D

      Below is a sample folder; it will have many text files.

      [Attached image 43746: sample folder]

      Each text file has a specific string I need to extract. The string is between two placeholders.

      Let’s call them Begin StringToExtract End.

      I simply wanted to loop through all the text files and extract each string between the placeholders,

      then output the strings to a single text file.

      I’ve been testing too many batch files unsuccessfully, so I’ve passed it over to you good experts 🙂

      Please let me know if this makes sense.

      thanks ever so much

      pb

    • #1554075

      PB,

      Do all the matches from all the files go to a single file or is it each file’s matches to a different file? :cheers:

      RG

    • #1554141

      PB,

      OK, here’s a PS script that should do the trick.

      Code:
      Param (
      	[Parameter(Mandatory=$true)] 
      	   [string] $FullyQualifiedPath
      )
      
      #Note: Make sure the following path is outside those being searched!
      $OutputFile = "G:\Test\Results\BegEndResults.txt"  #Adjust as needed
      
      #Matches any line holding a Begin ... End pair; adjust the tags between () as appropriate
      $Pat1 = [regex] '(Begin).*(End)'
      
      #Clear any previous results; skip quietly if the file does not exist yet
      Remove-Item "$OutputFile" -ErrorAction SilentlyContinue
      
      Get-ChildItem -Path "$FullyQualifiedPath" |
      
      Get-Content  | Select-String -Pattern $Pat1 -AllMatches >> "$OutputFile"
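
The script above writes each whole matching line to the results file. If only the text between the two markers is wanted, a capture group can pull it out. The following is a minimal sketch under assumptions: the `$lines` array is made-up sample data standing in for the `Get-Content` output, and the pattern `Begin\s+(.*?)\s+End` is one possible choice, not the pattern used in the script.

```powershell
# Hypothetical sample lines standing in for the output of Get-Content
$lines = @(
    'This is a test',
    'Begin Copy Line 1 End',
    'Do Not copy',
    'Begin Copy Line 2 End'
)

# Group 1 captures whatever sits between the Begin and End markers
$pattern = 'Begin\s+(.*?)\s+End'

# Select-String keeps the regex Match objects in .Matches; Groups[1] is the captured text
$extracted = @(
    $lines |
        Select-String -Pattern $pattern -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Groups[1].Value }
)

$extracted   # Copy Line 1, Copy Line 2
```

Note that `Select-String` is case-insensitive unless `-CaseSensitive` is given, so `begin` and `END` would also match here.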
      

      If you haven’t used PS before see this post #1-3.

      Test run:

      Code:
      cmdlet  at command pipeline position 1
      Supply values for the following parameters:
      FullyQualifiedPath: G:\Test\Search*.txt
      

      Note: If you don’t supply the drive\path\file pattern, it will prompt you as above.

      Test File1:

      Code:
      This is a test
      Begin Copy Line 1 End
      Do Not copy
      Do Not Copy
      Begin Copy Line 2 End
      Do Not Copy
      

      Test File 2:

      Code:
      This is a test
      Begin Copy Line 3 End
      Do Not copy
      Do Not Copy
      Begin Copy Line 4 End
      Do Not Copy
      

      Results File

      Code:
      Begin Copy Line 1 End
      Begin Copy Line 2 End
      Begin Copy Line 3 End
      Begin Copy Line 4 End
      

      Of course, this is base code and could be prettied up with prompts/directory pickers/and various other bells and whistles.

      HTH :cheers:

      RG

    • #1554147

      RG,

      thank you very much for the great work.

      I have to power up PowerShell now and take baby steps :D

      I have used it a few times before. I know we are meant to use PowerShell more, but remember the problem I had with Visual Studio :eek:

      It always tends to scare me now.

      I will refresh my notes.

      I will post back in a bit

      pb

    • #1554155

      I have another method in mind, however my solution isn’t complete. RG has provided a complete solution so I’ll defer to him.

    • #1554158

      Hi RG,

      I get the output file, but it’s empty :(

      This is the folder I pointed to:

      C:\Users\PBL\Desktop\Test* .txt

      $OutputFile = "C:\Users\PBL\Desktop\Results.txt" #Adjust as needed

      I hope it’s not something to do with the security settings again.

      pb

    • #1554180

      PB,

      Could you post one of your files or at least a line from your file that should match?

      Have you modified the [RegEx] pattern?

      :cheers:

      RG

    • #1554183

      Hi RG,

      to keep it simple I am only testing on 2 files

      File 1

      Code:
      This is a test
      Begin John End
      Do Not copy
      Do Not Copy
      Begin Copy Line 1 End
      Do Not Copy
      
      

      File 2

      Code:
      This is a test
      Begin Sarah End
      Do Not copy
      Do Not Copy
      Begin Copy Line 2 End
      Do Not Copy
      
      

      Just duplicated your files.

      Was I meant to change the regex line? I am really bad with regex, so I did not touch it.

      PB

    • #1554184

      PB,

      I tested with your contents and got these results:

      Code:
      Begin John End
      Begin Copy Line 1 End
      Begin Sarah End
      Begin Copy Line 2 End
      

      Are you sure you’re entering the file path correctly? It needs NO SPACES, or else it must be enclosed in quotes. There should NOT be a space between the * and the .

      Off to play tennis so I’ll be gone for about 2.5 hours.

      HTH :cheers:

      RG

    • #1554192

      RG,

      PowerShell is like a spoilt cry baby; it needs a good old-fashioned stick or something to get it shipshape.

      Well, my newbie lack of skills is to blame too ;)

      Right, here is the problem: the path name.

      Before (no):
      C:\Users\PBL\Desktop\Test* .txt

      After (yes):
      C:\Users\PBL\Desktop\Test\*.txt

      The backslash is needed.

      You have to be very careful when copying and pasting path names. I learned the hard way; I had been climbing the walls and couldn’t figure it out.
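
One way to catch this kind of path slip before running the extraction is to count what the wildcard path actually matches. The sketch below is only an illustration: the temp folder and file names are hypothetical, created so the snippet runs by itself, and the "bad" pattern simply reproduces the stray space between `*` and `.txt`.

```powershell
# Hypothetical scratch folder with two text files, created only for this check
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("PathCheck_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value 'Begin John End'
Set-Content -Path (Join-Path $dir 'file2.txt') -Value 'Begin Sarah End'

$good = Join-Path $dir '*.txt'    # folder, backslash, then the pattern
$bad  = Join-Path $dir '* .txt'   # same pattern with a stray space after the *

# Count the files each pattern matches; zero means the extraction would produce an empty file
$goodCount = @(Get-ChildItem -Path $good).Count
$badCount  = @(Get-ChildItem -Path $bad -ErrorAction SilentlyContinue).Count

"Good pattern matched $goodCount file(s); bad pattern matched $badCount file(s)"

# Clean up the scratch folder
Remove-Item -Path $dir -Recurse -Force
```

A count of zero for the path you typed is a strong hint that the output file will come out empty.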

      I ended up changing the remote settings and aghhhhh

      Thanks for persevering with me RG, this is solved but let me play and do some notes

      :cheers:

      pb

    • #1554222

      PB,

      Ok here’s the fully parametrized version.

      You can either call it passing parameters

      Code:
      PS> & ".\Regex Select Strings to FileV3.ps1" -SourcePath G:\Test -FileFilter "sea*.txt" -DestPath G:\Test\Results -DestFName Results -MyRegex '([Bb]egin).*([Ee]nd)'
      

      OR

      Code:
      PS> & ".\Regex Select Strings to FileV3.ps1"
      

      and you will be prompted, using graphical UI elements, for all the necessary items. You can also provide just the items you want and you will be prompted for the others, and if you provide bad paths you will be prompted to select new ones.

      Code:
      Param (
      	[Parameter(Mandatory=$False)] 
      	   [string] $SourcePath,
      	[Parameter(Mandatory=$False)] 
      	   [string] $FileFilter,
              [Parameter(Mandatory=$False)]
                 [string] $DestPath,
              [Parameter(Mandatory=$False)]
                 [string] $DestFName,
              [Parameter(Mandatory=$False)]
                 [string] $MyRegEx 
      )
      
      Function Get-Folder() {
      
        Param (
      	[Parameter(Mandatory=$False)] 
      	   [string] $Prompt
        )
      
        $FolderBrowser = New-Object System.Windows.Forms.FolderBrowserDialog
        $FolderBrowser.RootFolder = [System.Environment+SpecialFolder]'MyComputer'
        $FolderBrowser.ShowNewFolderButton = $false
        $FolderBrowser.SelectedPath = "C:\"
        $FolderBrowser.Description = $Prompt
      
        $Status = $FolderBrowser.ShowDialog()
        #Return the chosen Drive\Path, or "Cancel" if the dialog was dismissed
        If ($Status -eq [System.Windows.Forms.DialogResult]::Cancel) { "Cancel" }
        Else { $FolderBrowser.SelectedPath }
      
      } #End Get-Folder
      
      Function Show-Msg {
        Param ( [Parameter(Mandatory=$True, 
                 HelpMessage="Message box content.")]
                  [string]$Msg ,
                [Parameter(Mandatory=$False,
                 HelpMessage="Message box title.")]
                  [string]$Title = "Information"
              )          
      [Windows.Forms.MessageBox]::Show("$Msg", "$Title", 
           [Windows.Forms.MessageBoxButtons]::OK , 
           [Windows.Forms.MessageBoxIcon]::Information) 
      
      }  #End Function Show-Msg
      
      # add a helper
      $showWindowAsync = Add-Type -MemberDefinition @"
      [DllImport("user32.dll")]
      public static extern bool ShowWindowAsync(IntPtr hWnd, int nCmdShow);
      "@ -Name "Win32ShowWindowAsync" -Namespace Win32Functions -PassThru
      
      function Show-PowerShell() { 
           [void]$showWindowAsync::ShowWindowAsync((Get-Process -Id $pid).MainWindowHandle, 10) 
      }
      
      function Hide-PowerShell() { 
          [void]$showWindowAsync::ShowWindowAsync((Get-Process -Id $pid).MainWindowHandle, 2) 
      }
      
      #--- Main Program ---
      
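      Add-Type -AssemblyName System.Windows.Forms
      #Needed for the [Microsoft.VisualBasic.Interaction]::InputBox prompts below
      Add-Type -AssemblyName Microsoft.VisualBasic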
      Clear-Host
      Hide-PowerShell
      
      If ($SourcePath -eq "" -or (-not (Test-Path -Path $SourcePath))) { 
          $SourcePath = Get-Folder -Prompt "Select Source File Folder"
      }
      
      If ($SourcePath -eq "Cancel") {
          Show-Msg -Msg "User Pressed Cancel" -Title "Program Exit" 
          Exit
      }
           
      If ($FileFilter -eq "") {
        $InputBox = [Microsoft.VisualBasic.Interaction]
        $Prompt   = "Enter file filter pattern:"
        $Title    = "Filter Pattern"
        $Default  = "*.txt"
        $FileFilter = $Inputbox::Inputbox($Prompt, $Title, $Default)
      }
      
      If ($FileFilter -eq "") {   #InputBox returns an empty string on Cancel
          Show-Msg -Msg "User Pressed Cancel" -Title "Program Exit" 
          Exit
      }  
      
      If ($DestPath -eq "" -or (-not (Test-Path -Path $DestPath))) { 
          $DestPath = Get-Folder "Select Results Destination Folder"
      }
      
      If ($DestPath -eq "Cancel") {
          Show-Msg -Msg "User Pressed Cancel" -Title "Program Exit" 
          Exit
      }
      
      If ($DestFName -eq "") {
        $InputBox = [Microsoft.VisualBasic.Interaction]
        $Prompt   = "Enter Destination File Name:"
        $Title    = "Destination File Name"
        $Default  = "Results"
        $DestFName = $Inputbox::Inputbox($Prompt, $Title, $Default)
      }
      
      If ($DestFName -eq "") {
          Show-Msg -Msg "User Pressed Cancel" -Title "Program Exit" 
          Exit
      }
      
      If ($MyRegEx -eq "") {
        $InputBox = [Microsoft.VisualBasic.Interaction]
        $Prompt   = "Enter Regular Expression:"
        $Title    = "Regular Expression Input:"
        $Default  = "([Bb]egin).*([Ee]nd)"
        $MyRegEx = $Inputbox::Inputbox($Prompt, $Title, $Default)
      }
      
      If ($MyRegEx -eq "") {
          Show-Msg -Msg "User Pressed Cancel" -Title "Program Exit" 
          Exit
      }
      
      If (Test-Path -Path "$DestPath\$DestFName.txt") { 
        Remove-Item "$DestPath\$DestFName.txt" 
      }
      
      Get-ChildItem -Path $SourcePath -Filter "$FileFilter" |
      
      Get-Content  | Select-String -Pattern $MyRegEx `
                                   -AllMatches >> "$DestPath\$DestFName.txt"
      
      Show-Powershell
      

      BTW: When you create the file from the code above a shorter name w/o spaces makes it easier to enter and you won’t need the quotes around it.

      HTH :cheers:

      RG

    • #1554242

      RG,

      this is a lot of extra work you didn’t have to do:D

      As you know baby steps – with me.

      RegEx, powershell and VBA are my weaknesses.

      I will power up the powershell and do some testing

      many thanks again – stellar work and overgenerous help:cool:

      pb

    • #1554294

      RG,

      thanks again for all your awesome help – the “OG” of powershell 😎

      I am starting to warm up to it and have made some baby scripts all of them one liners- be a pro in a few months 😀

      Code:
      $re = [regex] '{}'
      $re.Replace([string]::Join("`n", (gc C:\Users\PBL\Desktop\test\regex.txt)), '####', 1)
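
The last argument in that `Replace` call caps how many matches get replaced. Here is a minimal sketch of the difference, using made-up sample text. The pattern `Begin` is chosen only for illustration and is not the `{}` pattern from the one-liner.

```powershell
# Hypothetical sample text: two lines joined with a newline, as in the one-liner
$text = [string]::Join("`n", @('Begin one End', 'Begin two End'))

$re = [regex] 'Begin'

# With a count of 1, only the first match is replaced
$once = $re.Replace($text, '####', 1)

# Without a count, every match is replaced
$all = $re.Replace($text, '####')

$once
$all
```

With the count set to 1, the second line keeps its `Begin`; without it, both lines are changed.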
      

      Regex and Powershell rolled into one – tried to avoid it for a long time but can no longer hide from it – so learning to make good friends with the powershell

      folks you all have a great week 😉

      cheers :cheers:

      pb
