Dave
We access the file from many different programs, so the format needs to stay the same. The file isn't written sequentially (although most of the data we write is sequential in spots). When we read it, we compute where in the file to read from, as we usually only want very small pieces of it. The size of the file varies as the number of records varies with each file.

In the case I have, the file grows to about 900 MB, and it holds several thousand cases, each made up of a lot of records. It takes about 2 minutes to write each case out, so it takes about 12 hours to do all the writing for this case (the rest of the computations take about 2 days). Reducing this would be very helpful. I know writing it out sequentially takes very little time.
Have you run the program through a performance analyzer such as Intel VTune Amplifier XE to see where the time is being spent? If you're waiting for the OS to complete the write or the position, there's not much you can do on the Fortran side.
On the plus side we are in the process of switching to a database so maybe that will be an improvement.
Consider writing your updates into a separate file (or files), then run a sequential merge into your 900 MB file.
The merge of the 900 MB main file with one or more small files could be on the order of 100 MB/sec.
Your mileage may vary. Can your main file be "off line" for under a minute?
Jim Dempsey
[SergeyK] The Windows OS doesn't have, and doesn't impose, any I/O limitations on applications. It tries
to allocate as much memory as possible for data caches, and data from these caches
is then written to an HDD.

>>On the plus side we are in the process of switching to a database so maybe that will be an improvement.

[SergeyK] What database are you going to use?
A low-level API developed in C/C++ to store some data from a Fortran application could be another option.
But since a database will be used anyway, it doesn't make sense to spend time on such an API.
I'm planning some performance tests / evaluations with a C/C++ API next week, and I could provide
some numbers for cases with 1 GB and 2 GB data files ( in binary and txt formats ).

Best regards,
Sergey
>>it takes about 12 hours to do all the writing for this case

I verified performance of a low-level API ( C/C++ ) for writing some data into a data file in text and binary formats.

In case of a text format the content looked like:
ROW1=
ROW2=
...
ROWN=

In case of a binary format the content looked like:
ROW1=
ROW2=
...
ROWN=

Here are the results ( numbers are relative ):

                          Text Format    Binary Format
Data file size 250MB      1x             ~5.0x faster
Data file size 500MB      1x             ~4.5x faster
Data file size 1GB        1x             ~4.0x faster
Don't forget the physical layer. As far as I know, data are written to the disk by cluster. So if you want to write 8x16 bytes, the cluster in which your data has to be written must first be read into memory from the disk, then your data is inserted into it, and the cluster has to be written back to the disk. For a common cluster size of 4 KB, you read and write 4 KB each time you believe you are writing only 128 bytes.
Thanks for the note. I think nobody would argue regarding this. Here are some additional details. All tests
were writing the contents of a matrix of single-precision values ( 'float' data type ):

Size of output txt-file: ~250MB - Matrix size:  5250 x  5250
Size of output txt-file: ~500MB - Matrix size:  8000 x  8000
Size of output txt-file: ~  1GB - Matrix size: 10750 x 10750

Note: To calculate the total length of a row, multiply the matrix 'm' value by '9'. For example, 5250 x 9 = 47250 bytes.
