I have two data sets, A and B. The format for each data set is one number per line. For instance,
12345
23456
67891
2345900
12345
Some of the entries in A are not included in data set B. How can I list the entries in A that are missing from B, and how can I list the entries shared by A and B, using Linux/UNIX commands?
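One possible approach, sketched under the assumption that A and B are plain files with one entry per line: grep can compare whole lines as fixed strings against a list of patterns read from a file. The function names only_in_a and in_both are made up for this illustration, and duplicates (such as the repeated 12345 in the sample) are collapsed by sort -u.

```shell
# only_in_a A B: print the lines of A that do not occur anywhere in B.
#   -F  treat each line of B as a fixed string, not a regular expression
#   -x  require the whole line to match
#   -v  invert the match, keeping lines of A that match nothing in B
#   -f  read the list of strings from file B
only_in_a() {
    grep -Fxv -f "$2" "$1" | sort -u
}

# in_both A B: print the lines of A that also occur in B.
in_both() {
    grep -Fx -f "$2" "$1" | sort -u
}
```

Called as only_in_a A B and in_both A B. When both files are already sorted, comm -23 A B and comm -12 A B are the usual alternative.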
I am trying to count the number of lines in a data.txt file and, if the number of lines is greater than 1, have the file emailed to me.
I could not find what is wrong with my code; I hope you can point out the mistake I made.
Code:
#!
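The posted script is cut off, so the following is not a correction of it, only a minimal sketch of the stated goal. It assumes a mail(1)-style client that accepts -s for the subject and reads the message body from standard input. The function name mail_if_multiline and the MAILER override variable are hypothetical.

```shell
# mail_if_multiline FILE RECIPIENT
# Send FILE as the message body when it contains more than one line.
# MAILER is a hypothetical hook for choosing a different mail client;
# it defaults to "mail".
mail_if_multiline() {
    file=$1
    recipient=$2
    # The arithmetic expansion strips the padding some wc versions print.
    lines=$(( $(wc -l < "$file") ))
    if [ "$lines" -gt 1 ]; then
        "${MAILER:-mail}" -s "$file has $lines lines" "$recipient" < "$file"
    fi
}
```

A call would look like mail_if_multiline data.txt you@example.com, with the address being a placeholder. Note the spaces around the brackets and the numeric -gt operator, both of which are common trouble spots in tests like this one.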
I have a huge amount of data in which each (data) line should be unique.
There are a lot of files in one folder for which this is already true. It is about 15GB split into roughly 170 files with 1000000 lines. Let's call that folder foo.
Now there is a second folder (bar) with even more data. Within each of its files, no entry occurs more than once.
Hi
I have 1000 files labelled data1.txt through data1000.txt. I want to write a script that prints out the number of lines in each txt file and outputs it in the following format:
Column 1: number of data file (1 through 1000)
Column 2: number of lines in the text file
Thanks!
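A minimal sketch of such a script, assuming the files sit in one directory and follow the data1.txt ... data1000.txt naming exactly. The function name count_data_lines and its two optional arguments (directory and highest file number) are invented for this example. Missing files are skipped silently.

```shell
# count_data_lines [DIR] [LAST]
# Print "file number <TAB> line count" for DIR/data1.txt .. DIR/dataLAST.txt.
count_data_lines() {
    dir=${1:-.}
    last=${2:-1000}
    i=1
    while [ "$i" -le "$last" ]; do
        f="$dir/data$i.txt"
        if [ -f "$f" ]; then
            # Reading from stdin keeps wc from printing the file name;
            # $(( ... )) strips any padding around the number.
            printf '%d\t%d\n' "$i" "$(( $(wc -l < "$f") ))"
        fi
        i=$((i + 1))
    done
}
```

Running count_data_lines with no arguments covers data1.txt through data1000.txt in the current directory, in numeric order.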
I have a big data project that has thousands of entries. The data has roughly 20 columns including cylinders, gas mileage, make, model etc. I'm using awk to output all the data. I have to organize the data into a nice table.
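One way to get aligned columns out of awk is printf with fixed field widths. The sketch below assumes comma-separated input with a header row and shows only four of the roughly 20 columns; the column order (make, model, cylinders, gas mileage) and the widths are guesses for illustration, not taken from the actual data.

```shell
# format_table FILE
# Print the first four comma-separated fields as fixed-width columns.
#   %-12s  left-justify text in 12 characters
#   %9s    right-justify in 9 characters (numbers line up on the right)
format_table() {
    awk -F',' '{ printf "%-12s %-12s %9s %7s\n", $1, $2, $3, $4 }' "$1"
}
```

Values longer than the chosen width are not cut off; they push the rest of the row to the right, so the widths need to match the longest entry in each column.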
Is there a way with awk to print all lines after a string is found?
There is a file like this:
Code:
.......
........
2012/19/11 :11.58 PM some data
lne no date
2012/19/11 :11.59 PM some other data
2012/20/11 :12.00 AM some other data
some line without dates
some more lines without dates
2012/20/11 :12.01 AM some more data
....
.......
I need a way to display all lines after the f
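The question is cut off, so this is only a sketch of the general technique: set a flag when the string first appears and print every line while the flag is set. It assumes the search text is a literal string rather than a regular expression, which is why it uses index() instead of a /pattern/ match. The function names print_from and print_after are hypothetical.

```shell
# print_from STRING FILE
# Print the first line containing STRING and every line after it.
print_from() {
    awk -v s="$1" '
        index($0, s) { found = 1 }   # literal substring match sets the flag
        found                        # a bare pattern prints the line when true
    ' "$2"
}

# print_after STRING FILE
# Same idea, but the matching line itself is left out.
print_after() {
    awk -v s="$1" '
        found                        # evaluated before the flag is set below
        index($0, s) { found = 1 }
    ' "$2"
}
```

With the sample above, print_from "2012/20/11 :12.00 AM" file would start at the 12.00 AM line and include the undated lines that follow it.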
Hi All,
I have a requirement to alter the content of a file once it has been generated.
The file consists of multiple lines, anywhere from 2000 to 8000.
I need to change the content of the lines that match this pattern:
1|DT_Buried_Strap_Resistor~RES|DT_Buried_Strap_Resistor~RES|-1.000000e+20|1.000000e+20|-1.000000e+20|1.000000e+20|-1.000000e+20|1.000000e+20|VorR|0|0|
In short it must hav
I have a file that's about 2GB, and I have to get the total number of lines in this file every 10 minutes.
The interval is not an issue; I just need the proper, most efficient way to do this.
Any ideas?
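Two sketches, offered with some caveats. The first simply uses wc -l, which is the usual answer and reads the file once per call. The second avoids rereading old data by remembering how many bytes and lines were already counted. It is only valid under the assumption that the file grows purely by appending (a log, for instance), which the post does not say. The function names and the state-file argument are invented for this example.

```shell
# count_lines FILE: full count; reads the whole file every time.
count_lines() {
    echo "$(( $(wc -l < "$1") ))"
}

# count_lines_incremental FILE STATE
# Counts only the bytes appended since the previous call.
# STATE is a small file holding "<bytes seen> <lines seen>".
# ASSUMPTION: FILE is append-only between calls.
count_lines_incremental() {
    file=$1
    state=$2
    old_bytes=0
    old_lines=0
    if [ -f "$state" ]; then
        read -r old_bytes old_lines < "$state"
    fi
    size=$(( $(wc -c < "$file") ))
    # A file smaller than last time was truncated or replaced: start over.
    if [ "$size" -lt "$old_bytes" ]; then
        old_bytes=0
        old_lines=0
    fi
    # tail -c +N starts output at byte N; wc -lc reports lines, then bytes.
    set -- $(tail -c "+$((old_bytes + 1))" "$file" | wc -lc)
    total=$((old_lines + $1))
    echo "$((old_bytes + $2)) $total" > "$state"
    echo "$total"
}
```

The incremental version counts newline characters, so a partially written last line is picked up on the next call once its newline arrives.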
I got the following from another thread on this site:
Code:
awk 'int(100*rand())%5<1' file
but this randomly pulls out 20% of the lines in a file.
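For reference, the one-liner works like this: int(100*rand()) is an integer from 0 to 99, taking it modulo 5 leaves 0 to 4, and the test <1 is true only for 0, so each line is kept with probability 1/5. The result is therefore about 20% of the lines, not exactly 20%. The sketch below is one way to make the percentage a parameter; the function name sample_lines is hypothetical, and since the rest of the post is cut off it may not be what was ultimately wanted.

```shell
# sample_lines PERCENT FILE
# Keep each line independently with probability PERCENT/100.
sample_lines() {
    awk -v pct="$1" '
        BEGIN { srand() }       # seed from the clock; without this, some awk
                                # implementations (gawk, for one) repeat the
                                # same "random" selection on every run
        rand() * 100 < pct      # rand() is in [0, 1), so pct=100 keeps all
    ' "$2"
}
```

sample_lines 20 file behaves like the original command, apart from the seeding.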
I am a newbie to shell scripting.
I have a large log file, and I need to work on the part of the log file for a particular date.
Is there a way to find the first occurrence of the date string and the last occurrence of the next day's date string, and move this section to a new file?
To explain it further:
There is a file like this:
-------------------
10/30/2012 some old data
10/31/2012 some dat
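The sample is cut off, so the following rests on one reading of the request: copy everything from the first line that starts with the wanted date up to, but not including, the first later line that starts with the next day's date, keeping any undated lines in between. Both dates are passed in as literal strings in the log's own MM/DD/YYYY format. The function name extract_day is made up for this sketch.

```shell
# extract_day START_DATE NEXT_DATE LOGFILE
# Print the block of LOGFILE belonging to START_DATE.
extract_day() {
    awk -v start="$1" -v stop="$2" '
        on && index($0, stop) == 1 { exit }    # next day begins: stop reading
        index($0, start) == 1      { on = 1 }  # first line of the wanted day
        on                                     # print while inside the block
    ' "$3"
}
```

For example, extract_day 10/31/2012 11/01/2012 logfile > newfile writes the section to a new file. This copies the lines and leaves the original log untouched; if the next day's date never appears, the output runs to the end of the file.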