Objective-C: Reading a file line by line
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/1044334/
Asked by
What is the appropriate way of dealing with large text files in Objective-C? Let's say I need to read each line separately and want to treat each line as an NSString. What is the most efficient way of doing this?
One solution is using the NSString method:
+ (id)stringWithContentsOfFile:(NSString *)path
encoding:(NSStringEncoding)enc
error:(NSError **)error
and then split the lines with a newline separator, and then iterate over the elements in the array. However, this seems fairly inefficient. Is there no easy way to treat the file as a stream, enumerating over each line, instead of just reading it all in at once? Kinda like Java's java.io.BufferedReader.
Accepted answer by Quinn Taylor
That's a great question. I think @Diederik has a good answer, although it's unfortunate that Cocoa doesn't have a mechanism for exactly what you want to do.
NSInputStream allows you to read chunks of N bytes (very similar to java.io.BufferedReader), but you have to convert the bytes to an NSString on your own, then scan for newlines (or whatever other delimiter) and save any remaining characters for the next read, or read more characters if a newline hasn't been read yet. (NSFileHandle lets you read an NSData which you can then convert to an NSString, but it's essentially the same process.)
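The read-a-chunk, scan-for-newlines, keep-the-remainder loop described above can be sketched in plain C, which compiles unchanged inside an Objective-C file. This is only a minimal sketch under stated assumptions, not Cocoa API: the names line_reader, lr_init, lr_next and lr_free are hypothetical, fread stands in for the NSInputStream read, only '\n' is treated as a delimiter, and a read error is treated the same as end-of-file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: holds bytes that were read but not yet returned. */
typedef struct {
    FILE  *fp;
    char  *buf;    /* pending bytes */
    size_t start;  /* offset of the first unconsumed byte */
    size_t used;   /* number of valid bytes in buf */
    size_t cap;    /* allocated size of buf */
    size_t chunk;  /* how many bytes to request per read */
    int    eof;    /* set once a read returns no data */
} line_reader;

void lr_init(line_reader *r, FILE *fp, size_t chunk)
{
    r->fp = fp;
    r->buf = NULL;
    r->start = r->used = r->cap = 0;
    r->chunk = chunk ? chunk : 4096;
    r->eof = 0;
}

void lr_free(line_reader *r)
{
    free(r->buf);
    r->buf = NULL;
}

/* Returns the next line as a NUL-terminated string without its '\n'.
   The pointer is valid only until the next call. Returns NULL at
   end-of-file or if memory cannot be allocated. */
char *lr_next(line_reader *r)
{
    for (;;) {
        if (r->used > r->start) {
            char *line = r->buf + r->start;
            size_t avail = r->used - r->start;
            char *nl = memchr(line, '\n', avail);
            if (nl != NULL) {
                *nl = '\0';                        /* cut the line here */
                r->start += (size_t)(nl - line) + 1;
                return line;
            }
            if (r->eof) {                          /* last line, no newline */
                line[avail] = '\0';                /* one spare byte is always reserved */
                r->start = r->used;
                return line;
            }
        } else if (r->eof) {
            return NULL;
        }

        /* No complete line yet: move the remainder to the front... */
        if (r->start > 0) {
            memmove(r->buf, r->buf + r->start, r->used - r->start);
            r->used -= r->start;
            r->start = 0;
        }
        /* ...make room for one more chunk plus a terminator... */
        if (r->used + r->chunk + 1 > r->cap) {
            size_t ncap = 2 * (r->used + r->chunk + 1);
            char *p = realloc(r->buf, ncap);
            if (p == NULL) return NULL;
            r->buf = p;
            r->cap = ncap;
        }
        /* ...and read the next chunk behind it. */
        size_t n = fread(r->buf + r->used, 1, r->chunk, r->fp);
        if (n == 0) r->eof = 1;
        r->used += n;
    }
}
```

Each returned C string could then be wrapped in an NSString, for example with [NSString stringWithUTF8String:], before the next call to lr_next invalidates the pointer.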
Apple has a Stream Programming Guide that can help fill in the details, and this SO question may help as well if you're going to be dealing with uint8_t* buffers.
If you're going to be reading strings like this frequently (especially in different parts of your program), it would be a good idea to encapsulate this behavior in a class that can handle the details for you, or even to subclass NSInputStream (it's designed to be subclassed) and add methods that allow you to read exactly what you want.
For the record, I think this would be a nice feature to add, and I'll be filing an enhancement request for something that makes this possible. :-)
Edit: Turns out this request already exists. There's a Radar dating from 2006 for this (rdar://4742914 for Apple-internal people).
Answer by Yoon Lee
This works for reading a string from a text file in general. If you need to read longer text (a large file), use one of the approaches other people mentioned here, such as buffered reading (reserving a block of memory for the text).
Say you read a text file.
NSString* filePath = @""; // file path...
NSString* fileRoot = [[NSBundle mainBundle]
    pathForResource:filePath ofType:@"txt"];
You want to split it at the newlines.
// read everything from text
NSString* fileContents =
    [NSString stringWithContentsOfFile:fileRoot
        encoding:NSUTF8StringEncoding error:nil];
// first, separate by new line
NSArray* allLinedStrings =
    [fileContents componentsSeparatedByCharactersInSet:
        [NSCharacterSet newlineCharacterSet]];
// then break down even further
NSString* strsInOneLine =
    [allLinedStrings objectAtIndex:0];
// choose whatever input identity you have decided. in this case ;
NSArray* singleStrs =
    [strsInOneLine componentsSeparatedByCharactersInSet:
        [NSCharacterSet characterSetWithCharactersInString:@";"]];
There you have it.
Answer by Adam Rosenfield
This should do the trick:
#import <Foundation/Foundation.h>
#include <stdio.h>

NSString *readLineAsNSString(FILE *file)
{
    char buffer[4096];

    // tune this capacity to your liking -- larger buffer sizes will be faster, but
    // use more memory
    NSMutableString *result = [NSMutableString stringWithCapacity:256];

    // Read up to 4095 non-newline characters at a time until the line ends
    int charsRead;
    do
    {
        charsRead = 0;
        if(fscanf(file, "%4095[^\n]%n", buffer, &charsRead) == 1)
            [result appendFormat:@"%s", buffer];
        else
            break; // empty line or end-of-file: nothing matched
    } while(charsRead == 4095);

    // The next character, if any, is the newline that ended the line: discard it
    (void)fgetc(file);

    return result;
}
Use as follows:
FILE *file = fopen("myfile", "r");
// check for NULL
while(!feof(file))
{
    // feof() only becomes true after a read hits end-of-file, so when the file
    // ends with a newline the last iteration yields an empty string
    NSString *line = readLineAsNSString(file);
    // do stuff with line; line is autoreleased, so you should NOT release it (unless you also retain it beforehand)
}
fclose(file);
This code reads non-newline characters from the file, up to 4095 at a time. If you have a line that is longer than 4095 characters, it keeps reading until it hits a newline or end-of-file.
Note: I have not tested this code. Please test it before using it.
Answer by Kornel
Mac OS X is Unix, and Objective-C is a superset of C, so you can just use old-school fopen and fgets from <stdio.h>. It's guaranteed to work.
[NSString stringWithUTF8String:buf] will convert a C string to an NSString. There are also methods for creating strings in other encodings and for creating them without copying.
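As a rough illustration of the fopen/fgets approach, here is a minimal C sketch. The helper name read_full_line is hypothetical, and the 256-byte piece size is an arbitrary assumption. Because fgets stops once its buffer is full, the sketch keeps appending pieces until it sees the newline, so lines longer than the buffer are not split.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one whole line with fgets, however long it is.
   Returns a malloc'd string without the trailing '\n' (the caller frees it),
   or NULL at end-of-file or on allocation failure. */
char *read_full_line(FILE *fp)
{
    char piece[256];
    char *line = NULL;
    size_t len = 0;

    while (fgets(piece, sizeof piece, fp) != NULL) {
        size_t n = strlen(piece);
        char *grown = realloc(line, len + n + 1);
        if (grown == NULL) {
            free(line);
            return NULL;
        }
        line = grown;
        memcpy(line + len, piece, n + 1);   /* copies the terminator too */
        len += n;
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';           /* strip the newline: line is complete */
            break;
        }
    }
    return line;
}
```

In Objective-C code, each returned buffer could be passed to [NSString stringWithUTF8String:] as described above and then freed.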
Answer by diederikh
You can use NSInputStream, which has a basic implementation for file streams. You can read bytes into a buffer (the read:maxLength: method). You have to scan the buffer for newlines yourself.
Answer by DCurro
A lot of these answers are long chunks of code, or they read in the entire file. I like to use the C functions for this very task.
FILE* file = fopen("path to my file", "r");
size_t length;
char *cLine = fgetln(file, &length);
// fgetln returns NULL once there are no more lines
while (cLine != NULL) {
    char str[length+1];
    strncpy(str, cLine, length);
    str[length] = '\0';
    NSString *line = [NSString stringWithFormat:@"%s",str];
    // Do what you want here.
    cLine = fgetln(file, &length);
}
fclose(file);
Note that the buffer returned by fgetln is not NUL-terminated and still contains the trailing newline character, if the line has one. Also, we add 1 to the length of str because we want to make space for the NUL terminator.
Answer by Stig Brautaset
The appropriate way to read text files in Cocoa/Objective-C is documented in Apple's String Programming Guide. The section on reading and writing files should be just what you're after. PS: What's a "line"? Two sections of a string separated by "\n"? Or "\r"? Or "\r\n"? Or maybe you're actually after paragraphs? The previously mentioned guide also includes a section on splitting a string into lines or paragraphs. (This section is called "Paragraphs and Line Breaks", and is linked to in the left-hand-side menu of the page I pointed to above. Unfortunately this site doesn't allow me to post more than one URL as I'm not a trustworthy user yet.)
To paraphrase Knuth: premature optimisation is the root of all evil. Don't simply assume that "reading the whole file into memory" is slow. Have you benchmarked it? Do you know that it actually reads the whole file into memory? Maybe it simply returns a proxy object and keeps reading behind the scenes as you consume the string? (Disclaimer: I have no idea if NSString actually does this. It conceivably could.) The point is: first go with the documented way of doing things. Then, if benchmarks show that this doesn't have the performance you desire, optimise.
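To make the "what's a line?" question concrete, here is a small C sketch that accepts "\n", "\r" and "\r\n" as line terminators when walking a buffer that is already in memory. The function name next_line and its signature are hypothetical, chosen only for illustration. It is not how NSString splits lines, and paragraph separators are out of scope.

```c
#include <stddef.h>

/* Hypothetical helper: looks at the first line in text[0..len).
   Returns the length of that line without its terminator and stores in
   *next the offset at which the following line starts. */
size_t next_line(const char *text, size_t len, size_t *next)
{
    size_t i = 0;

    /* advance to the first terminator character or the end of the buffer */
    while (i < len && text[i] != '\n' && text[i] != '\r')
        i++;

    size_t line_len = i;
    if (i < len) {
        /* "\r\n" counts as a single terminator */
        if (text[i] == '\r' && i + 1 < len && text[i + 1] == '\n')
            i += 2;
        else
            i += 1;
    }
    *next = i;
    return line_len;
}
```

Calling it repeatedly, each time advancing the start of the buffer by the offset stored in next, walks every line regardless of which of the three terminators the file uses.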
Answer by lukaswelte
Reading a file line by line (even an extremely big file) can be done with the following calls:
DDFileReader * reader = [[DDFileReader alloc] initWithFilePath:pathToMyFile];
NSString * line = nil;
while ((line = [reader readLine])) {
    NSLog(@"read line: %@", line);
}
[reader release];
Or:
DDFileReader * reader = [[DDFileReader alloc] initWithFilePath:pathToMyFile];
[reader enumerateLinesUsingBlock:^(NSString * line, BOOL * stop) {
    NSLog(@"read line: %@", line);
}];
[reader release];
The class DDFileReader that enables this is the following:
Interface File (.h):
@interface DDFileReader : NSObject {
    NSString * filePath;
    NSFileHandle * fileHandle;
    unsigned long long currentOffset;
    unsigned long long totalFileLength;
    NSString * lineDelimiter;
    NSUInteger chunkSize;
}
@property (nonatomic, copy) NSString * lineDelimiter;
@property (nonatomic) NSUInteger chunkSize;
- (id) initWithFilePath:(NSString *)aPath;
- (NSString *) readLine;
- (NSString *) readTrimmedLine;
#if NS_BLOCKS_AVAILABLE
- (void) enumerateLinesUsingBlock:(void(^)(NSString*, BOOL *))block;
#endif
@end
Implementation (.m):
#import "DDFileReader.h"

@interface NSData (DDAdditions)
- (NSRange) rangeOfData_dd:(NSData *)dataToFind;
@end

@implementation NSData (DDAdditions)
- (NSRange) rangeOfData_dd:(NSData *)dataToFind {
    const void * bytes = [self bytes];
    NSUInteger length = [self length];
    const void * searchBytes = [dataToFind bytes];
    NSUInteger searchLength = [dataToFind length];
    NSUInteger searchIndex = 0;
    NSRange foundRange = {NSNotFound, searchLength};
    for (NSUInteger index = 0; index < length; index++) {
        if (((char *)bytes)[index] == ((char *)searchBytes)[searchIndex]) {
            //the current character matches
            if (foundRange.location == NSNotFound) {
                foundRange.location = index;
            }
            searchIndex++;
            if (searchIndex >= searchLength) { return foundRange; }
        } else {
            searchIndex = 0;
            foundRange.location = NSNotFound;
        }
    }
    return foundRange;
}
@end

@implementation DDFileReader
@synthesize lineDelimiter, chunkSize;

- (id) initWithFilePath:(NSString *)aPath {
    if (self = [super init]) {
        fileHandle = [NSFileHandle fileHandleForReadingAtPath:aPath];
        if (fileHandle == nil) {
            [self release]; return nil;
        }
        lineDelimiter = [[NSString alloc] initWithString:@"\n"];
        [fileHandle retain];
        filePath = [aPath retain];
        currentOffset = 0ULL;
        chunkSize = 10;
        [fileHandle seekToEndOfFile];
        totalFileLength = [fileHandle offsetInFile];
        //we don't need to seek back, since readLine will do that.
    }
    return self;
}

- (void) dealloc {
    [fileHandle closeFile];
    [fileHandle release], fileHandle = nil;
    [filePath release], filePath = nil;
    [lineDelimiter release], lineDelimiter = nil;
    currentOffset = 0ULL;
    [super dealloc];
}

- (NSString *) readLine {
    if (currentOffset >= totalFileLength) { return nil; }
    NSData * newLineData = [lineDelimiter dataUsingEncoding:NSUTF8StringEncoding];
    [fileHandle seekToFileOffset:currentOffset];
    NSMutableData * currentData = [[NSMutableData alloc] init];
    BOOL shouldReadMore = YES;
    NSAutoreleasePool * readPool = [[NSAutoreleasePool alloc] init];
    while (shouldReadMore) {
        if (currentOffset >= totalFileLength) { break; }
        NSData * chunk = [fileHandle readDataOfLength:chunkSize];
        NSRange newLineRange = [chunk rangeOfData_dd:newLineData];
        if (newLineRange.location != NSNotFound) {
            //include the length so we can include the delimiter in the string
            chunk = [chunk subdataWithRange:NSMakeRange(0, newLineRange.location+[newLineData length])];
            shouldReadMore = NO;
        }
        [currentData appendData:chunk];
        currentOffset += [chunk length];
    }
    [readPool release];
    NSString * line = [[NSString alloc] initWithData:currentData encoding:NSUTF8StringEncoding];
    [currentData release];
    return [line autorelease];
}

- (NSString *) readTrimmedLine {
    return [[self readLine] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

#if NS_BLOCKS_AVAILABLE
- (void) enumerateLinesUsingBlock:(void(^)(NSString*, BOOL*))block {
    NSString * line = nil;
    BOOL stop = NO;
    while (stop == NO && (line = [self readLine])) {
        block(line, &stop);
    }
}
#endif
@end
The class was done by Dave DeLong.
Answer by wdanxna
Just like @porneL said, the C API is very handy.
NSString* fileRoot = [[NSBundle mainBundle] pathForResource:@"record" ofType:@"txt"];
FILE *file = fopen([fileRoot UTF8String], "r");
char buffer[256];
while (fgets(buffer, 256, file) != NULL){
    NSString* result = [NSString stringWithUTF8String:buffer];
    NSLog(@"%@",result);
}
Answer by Bjørn Olav Ruud
As others have answered, both NSInputStream and NSFileHandle are fine options, but it can also be done in a fairly compact way with NSData and memory mapping:
BRLineReader.h
#import <Foundation/Foundation.h>
@interface BRLineReader : NSObject
@property (readonly, nonatomic) NSData *data;
@property (readonly, nonatomic) NSUInteger linesRead;
@property (strong, nonatomic) NSCharacterSet *lineTrimCharacters;
@property (readonly, nonatomic) NSStringEncoding stringEncoding;
- (instancetype)initWithFile:(NSString *)filePath encoding:(NSStringEncoding)encoding;
- (instancetype)initWithData:(NSData *)data encoding:(NSStringEncoding)encoding;
- (NSString *)readLine;
- (NSString *)readTrimmedLine;
- (void)setLineSearchPosition:(NSUInteger)position;
@end
BRLineReader.m
#import "BRLineReader.h"

static unsigned char const BRLineReaderDelimiter = '\n';

@implementation BRLineReader
{
    NSRange _lastRange;
}

- (instancetype)initWithFile:(NSString *)filePath encoding:(NSStringEncoding)encoding
{
    self = [super init];
    if (self) {
        NSError *error = nil;
        _data = [NSData dataWithContentsOfFile:filePath options:NSDataReadingMappedAlways error:&error];
        if (!_data) {
            NSLog(@"%@", [error localizedDescription]);
        }
        _stringEncoding = encoding;
        _lineTrimCharacters = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    }
    return self;
}

- (instancetype)initWithData:(NSData *)data encoding:(NSStringEncoding)encoding
{
    self = [super init];
    if (self) {
        _data = data;
        _stringEncoding = encoding;
        _lineTrimCharacters = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    }
    return self;
}

- (NSString *)readLine
{
    NSUInteger dataLength = [_data length];
    NSUInteger beginPos = _lastRange.location + _lastRange.length;
    NSUInteger endPos = 0;
    if (beginPos == dataLength) {
        // End of file
        return nil;
    }
    unsigned char *buffer = (unsigned char *)[_data bytes];
    for (NSUInteger i = beginPos; i < dataLength; i++) {
        endPos = i;
        if (buffer[i] == BRLineReaderDelimiter) break;
    }
    // End of line found
    _lastRange = NSMakeRange(beginPos, endPos - beginPos + 1);
    NSData *lineData = [_data subdataWithRange:_lastRange];
    NSString *line = [[NSString alloc] initWithData:lineData encoding:_stringEncoding];
    _linesRead++;
    return line;
}

- (NSString *)readTrimmedLine
{
    return [[self readLine] stringByTrimmingCharactersInSet:_lineTrimCharacters];
}

- (void)setLineSearchPosition:(NSUInteger)position
{
    _lastRange = NSMakeRange(position, 0);
    _linesRead = 0;
}
@end
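For comparison, the same memory-mapping idea can be sketched at the POSIX level in C. This is a hedged illustration, not the code NSData uses internally: the function name count_lines_mapped is hypothetical, it assumes a POSIX system (such as Mac OS X) where open, fstat and mmap are available, and it only counts '\n'-delimited lines and records the longest one instead of building strings.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the file at path read-only, walks it line by
   line, and returns the number of lines (-1 on error). The length of the
   longest line, without its newline, is stored in *longest. */
long count_lines_mapped(const char *path, size_t *longest)
{
    *longest = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* a zero-length mapping is not allowed */

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                                      /* the mapping stays valid after close */
    if (data == MAP_FAILED) return -1;

    long lines = 0;
    const char *p = data;
    const char *end = data + size;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *stop = nl ? nl : end;           /* [p, stop) is one line */
        size_t line_len = (size_t)(stop - p);
        if (line_len > *longest) *longest = line_len;
        lines++;
        p = nl ? nl + 1 : end;
    }

    munmap(data, size);
    return lines;
}
```

Because the pages are loaded by the operating system on demand, the loop touches the file sequentially without an explicit read buffer, which is the property the NSData-based reader above relies on.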
