: : : I wrote a Delphi program that needs to access memory a lot. Unfortunately, it is very slow on my Pentium 4 (2.8 GHz, 512 MB). To speed it up, someone advised me to switch to a 64-bit CPU. Do you agree with them? Thanks
: : :
: : That won't help, because your program is probably slow due to the latency (http://en.wikipedia.org/wiki/RAM_latency) of fetching its data from memory. The best way to tackle this problem is to design another algorithm that doesn't need to access memory so much.
: :
: Thanks. You are right. But please consider the following code:
:
: //Code #1
: var M,N:array of array of byte;
: T:integer;
: ...
: begin
: SetLength(M,640,640);
: SetLength(N,640,640);
:
: //Fill each array with some data
: FillMatrix(M);
: FillMatrix(N);
:
: T:=Gettickcount;
: //Do some operations with M
: Operate(M);
:
: //Do some operations with N
: Operate(N);
:
: showmessage(inttostr(Gettickcount-T)); // elapsed time: about 800 msec
: end;
:
: In the above code, the whole process takes about 800 msec. Interestingly, if I change Operate(N) to Operate(M), the whole process takes only about 650 msec! In this case we have the following code:
:
: //Code #2
: var M,N:array of array of byte;
: T:integer;
: ...
: begin
: SetLength(M,640,640);
: SetLength(N,640,640);
:
: //Fill each array with some data
: FillMatrix(M);
: FillMatrix(N);
:
: T:=Gettickcount;
: //Do some operations with M
: Operate(M);
:
: //Do some operations with M
: Operate(M);
:
: showmessage(inttostr(Gettickcount-T)); // elapsed time: about 650 msec
: end;
:
: But I am wondering why this happens. Do you know the reason? I have another question too. Please consider the following code:
:
: //code 3
: var i,t:integer;
: ...
: t:=i;
:
:
: //code 4
: var t:integer;
: M:array[0..20] of array[0..20] of integer;
: ...
: t:=M[10,10];
:
: Which one is faster? (Code 3 or Code 4)
:
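I don't know the answer to the first question for certain, but it probably has to do with the CPU cache (L1/L2): after the first Operate(M), part of M may still be in the cache, so a second pass over M needs fewer slow trips to main memory than a first pass over N would. Compiler optimization could also play a role.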
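The answer to your second question is: Code 3 is at least as fast. In Code 4 the memory location of the element has to be calculated from the indexes; the assignment itself is equally fast. With constant indexes, as in your example, the compiler can normally do that calculation at compile time, so a real difference should only show up when the indexes are variables.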
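
To make the index calculation concrete, here is a minimal sketch of what "calculating the memory location" means for the static array from Code 4. It assumes a console build and a row-major layout with 21 elements per row; the program name and the variables I, J, ByteOffset and Expected are made up for illustration and are not from your code.

```pascal
program IndexOffset;
{$APPTYPE CONSOLE}

var
  // Same shape as the array in Code 4: 21 rows of 21 Integers.
  M: array[0..20] of array[0..20] of Integer;
  I, J: Integer;              // hypothetical variable indexes
  ByteOffset, Expected: Integer;
begin
  I := 10;
  J := 10;

  // Actual distance in bytes between M[I,J] and the first element.
  ByteOffset := PAnsiChar(@M[I, J]) - PAnsiChar(@M[0, 0]);

  // What the generated code has to work out when I and J are variables:
  // skip I full rows of 21 elements, then J elements within the row.
  Expected := (I * 21 + J) * SizeOf(Integer);

  WriteLn('actual offset:   ', ByteOffset);
  WriteLn('expected offset: ', Expected);
end.
```

Both lines should print the same value; with Delphi's 4-byte Integer that is (10 * 21 + 10) * 4 = 880 bytes. For t:=i there is no such multiply-and-add, which is the only extra work in the array case.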