Reading a large file

I'm trying to open a large text file (180 MB) with roughly 500,000 records inside. I have to process each line and insert it into my DB once it passes validation. The problem is that it only processes about 8 records/sec, which means it takes hours to do the lot. Can anyone help me with a faster/more efficient way of reading files?

Comments

  • I'm more inclined to suspect the validation or the insert statements as the guilty party. Reading a text file line by line is quite fast. For one of my projects I had to read and parse 40,000 lines (~7 MB), which took about 30 seconds. Extrapolating from that, I would expect reading your file to take about 10 minutes.
    You should test this incrementally: first run only the read part, then add the validation, and finally add the insertions (with some general exception handling). That should give you a clear idea of where the lack of speed comes from.
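    Something like the following would time just the read part (an untested sketch; 'records.txt' stands in for the real file, and WriteLn assumes a console app):

    uses
      Windows, SysUtils;
    var
      F: TextFile;
      Line: string;
      Count: Integer;
      Start: Cardinal;
    begin
      AssignFile(F, 'records.txt'); // placeholder file name
      Reset(F);
      try
        Count := 0;
        Start := GetTickCount;
        while not Eof(F) do
        begin
          ReadLn(F, Line); // read only: no validation, no DB insert yet
          Inc(Count);
        end;
      finally
        CloseFile(F);
      end;
      WriteLn(Format('%d lines in %d ms', [Count, GetTickCount - Start]));
    end;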
  • Hi zibadian, thanks for the reply. I took your advice and commented out all my validation & insert code. After 10 minutes I was only on record 30,648, which tells me my validations & inserts are slow, but the file reading itself is still way too slow as well. I'm using AssignFile and ReadLn. I need a way to speed this up. Any suggestions on how I can achieve this?
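    A common way to speed up a plain AssignFile/ReadLn loop is to give the text file a larger I/O buffer with SetTextBuf; the default buffer is only 128 bytes, so ReadLn goes back to the OS very often. An untested sketch ('records.txt' is a placeholder):

    var
      F: TextFile;
      Buf: array[0..65535] of AnsiChar; // 64 KB instead of the 128-byte default
      Line: string;
    begin
      AssignFile(F, 'records.txt'); // placeholder file name
      SetTextBuf(F, Buf, SizeOf(Buf)); // set the buffer before Reset opens the file
      Reset(F);
      try
        while not Eof(F) do
          ReadLn(F, Line); // most reads are now served from the 64 KB buffer
      finally
        CloseFile(F);
      end;
    end;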
  • You could try reading the file into a TStrings object, then use that object to parse the records.
    Another option is to load the entire file into a TMemoryStream and then parse that to read the records. This is more difficult, because TMemoryStream has no built-in method for reading a line.
    If neither of those options works, you'll need faster hardware.
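    A minimal sketch of the TStrings approach (untested; 'records.txt' is a placeholder, and note that LoadFromFile pulls the whole 180 MB into memory at once):

    uses
      Classes;
    var
      Lines: TStringList;
      i: Integer;
    begin
      Lines := TStringList.Create;
      try
        Lines.LoadFromFile('records.txt'); // one big read instead of 500,000 small ones
        for i := 0 to Lines.Count - 1 do
        begin
          // validate Lines[i] and insert it into the DB here
        end;
      finally
        Lines.Free;
      end;
    end;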
  • I have already tried reading it into a TStrings object, but the speed is roughly the same. I will try the TMemoryStream approach, see if I have any success with it, and let you know.
  • Use TMemoryStream; I can read 700 MB of small files (3 KB each) in 3 seconds.
    It is much faster to read the whole file into memory at once than to read 3 KB from the hard disk every time you want the next line.
    I don't think it's hard to find the lines in a TMemoryStream; after loading, you can create a function like this:

    // Reads one line from the stream, starting at the current Position.
    // Uses AnsiChar so each Read fetches exactly one byte.
    function ReadLineFromMemory(x: TMemoryStream): string;
    var
      s: string;
      c: AnsiChar;
    begin
      s := '';
      while x.Position < x.Size do
      begin
        x.Read(c, SizeOf(c)); // Read advances Position by one byte
        if c = #13 then
        begin
          // consume the #10 of a CR/LF pair so the next call starts on a fresh line
          if x.Position < x.Size then
          begin
            x.Read(c, SizeOf(c));
            if c <> #10 then
              x.Seek(-1, soFromCurrent); // lone CR: put the byte back
          end;
          Break;
        end;
        s := s + c;
      end;
      Result := s;
    end;

    I haven't tested it.
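    A rough usage sketch for the function above (also untested; 'records.txt' is a placeholder):

    uses
      Classes;
    var
      ms: TMemoryStream;
      Line: string;
    begin
      ms := TMemoryStream.Create;
      try
        ms.LoadFromFile('records.txt'); // whole file in memory with one disk read
        while ms.Position < ms.Size do
        begin
          Line := ReadLineFromMemory(ms);
          // validate and insert Line here
        end;
      finally
        ms.Free;
      end;
    end;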