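I have a very large XML file (>1 GB) whose CDATA content holds base64 text in lines of 76 characters, each terminated by CRLF. The last line is padded with "=" and has no CRLF.
The most natural way to get the base64 decoded is XmlReader.ReadElementContentAsBase64(…), but it stops working at around 100 MB (out of memory).
While searching for ways to convert a stream to and from Base64, I found this: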
    using System;
    using System.IO;
    using System.Security.Cryptography;
    using System.Text;

    string Original = "foo bar, this is an example";
    byte[] ToBase64;
    string Decoded;

    // Encode: text -> StreamWriter -> ToBase64Transform -> MemoryStream
    using ( MemoryStream ms = new MemoryStream() )
    using ( CryptoStream cs = new CryptoStream( ms, new ToBase64Transform(), CryptoStreamMode.Write ) )
    using ( StreamWriter st = new StreamWriter( cs ) )
    {
        st.Write( Original );
        st.Flush();
        // Emit the final (possibly padded) base64 block before copying the result.
        cs.FlushFinalBlock();
        ToBase64 = ms.ToArray();
    }

    // Decode: MemoryStream -> FromBase64Transform -> StreamReader -> text
    using ( MemoryStream ms = new MemoryStream( ToBase64 ) )
    using ( CryptoStream cs = new CryptoStream( ms, new FromBase64Transform(), CryptoStreamMode.Read ) )
    using ( StreamReader sr = new StreamReader( cs ) )
    {
        Decoded = sr.ReadToEnd();
    }

    Console.WriteLine( Original );
    Console.WriteLine( Encoding.Default.GetString( ToBase64 ) );
    Console.WriteLine( Decoded );
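This example takes a string as input, but I need to adapt the code to work on a file, and to start and stop reading at given positions within that file.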
                           CDataStart                              CDataEnd
                           |                                       |
                           V                                       V
    [xml..[Envelope]…[Body]base64 with 76 chars + CRLF + padding "="...[/Body]...[/Envelope]
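To make the goal concrete, here is a rough sketch of what a read-mode version of the string example might look like when pointed at a byte range of a file. RangeStream, Base64FileDecoder and DecodeFileRange are hypothetical names made up for this illustration, and the sketch assumes that the start and end offsets (CDataStart and CDataEnd in the diagram) are already known and that the input stream is seekable.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: a read-only view over bytes [start, end) of a seekable stream.
sealed class RangeStream : Stream
{
    private readonly Stream _inner;
    private readonly long _end;

    public RangeStream(Stream inner, long start, long end)
    {
        _inner = inner;
        _end = end;
        _inner.Position = start; // assumption: inner.CanSeek is true
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Never hand out bytes at or beyond the end offset.
        long remaining = _end - _inner.Position;
        if (remaining <= 0) return 0;
        return _inner.Read(buffer, offset, (int)Math.Min(count, remaining));
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

static class Base64FileDecoder
{
    // Hypothetical helper: decodes the base64 text in inputPath[start, end) into outputPath.
    public static void DecodeFileRange(string inputPath, long start, long end, string outputPath)
    {
        using (FileStream file = File.OpenRead(inputPath))
        using (RangeStream range = new RangeStream(file, start, end))
        using (CryptoStream decoded = new CryptoStream(
            range,
            new FromBase64Transform(FromBase64TransformMode.IgnoreWhiteSpaces),
            CryptoStreamMode.Read))
        using (FileStream output = File.Create(outputPath))
        {
            // CopyTo pulls the data through in small chunks, so memory use stays bounded.
            decoded.CopyTo(output);
        }
    }
}
```

The wrapper is only there so that CryptoStream sees an end of stream at the end offset; anything that limits reads to the range would serve the same purpose.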
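This is the best I have come up with so far:

    int readbytes = 0;
    long bytesToRead = (CDataEnd - CDataStart);
    // Assumes context.InStream is already positioned at CDataStart.
    using (CryptoStream cryptoStremFromBase64 = new CryptoStream(context.OutStream, new FromBase64Transform(FromBase64TransformMode.IgnoreWhiteSpaces), CryptoStreamMode.Write))
    {
        // First chunk: the remainder, so that the following chunks can be a full 10 MB each.
        byte[] bytebuffer = new byte[bytesToRead % 10485760];
        readbytes = context.InStream.Read(bytebuffer, 0, bytebuffer.Length);
        // Write only what was actually read; Read may return fewer bytes than requested.
        cryptoStremFromBase64.Write(bytebuffer, 0, readbytes);
        while (context.InStream.Position < CDataEnd)
        {
            WriteToLog("Context.Instream.Position = " + context.InStream.Position.ToString(), context, startingTime);
            byte[] bytebuffer2 = new byte[10485760];
            // Cap the request so a short read earlier cannot push us past CDataEnd.
            int toRead = (int)Math.Min(bytebuffer2.Length, CDataEnd - context.InStream.Position);
            readbytes = context.InStream.Read(bytebuffer2, 0, toRead);
            if (readbytes == 0) break; // unexpected end of stream
            cryptoStremFromBase64.Write(bytebuffer2, 0, readbytes);
        }
        cryptoStremFromBase64.Flush();
    }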
context.InStream is the base64-encoded SOAP message, and context.OutStream is the output: the body after conversion from base64.
Is there any way to improve this?
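One direction that might help, shown only as a sketch: keep the write-mode CryptoStream from the loop above, but reuse a single small buffer, track the remaining byte count instead of relying on the stream position, and pass the actual number of bytes read to Write. Base64RangeDecoder and DecodeRange are hypothetical names, and the sketch assumes a seekable input stream and known start and end offsets.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class Base64RangeDecoder
{
    // Hypothetical helper: decodes the base64 text stored in input[start, end) into output.
    // Note: disposing the CryptoStream also closes 'output'.
    public static void DecodeRange(Stream input, long start, long end, Stream output)
    {
        // One reusable buffer, kept small enough to stay off the large object heap.
        byte[] buffer = new byte[80 * 1024];
        input.Position = start; // assumption: input.CanSeek is true

        using (CryptoStream decoder = new CryptoStream(
            output,
            new FromBase64Transform(FromBase64TransformMode.IgnoreWhiteSpaces),
            CryptoStreamMode.Write))
        {
            long remaining = end - start;
            while (remaining > 0)
            {
                int toRead = (int)Math.Min(buffer.Length, remaining);
                int read = input.Read(buffer, 0, toRead);
                if (read == 0)
                    throw new EndOfStreamException("Input ended before the end offset was reached.");

                // Only the bytes actually read are passed on to the decoder.
                decoder.Write(buffer, 0, read);
                remaining -= read;
            }

            // Decode whatever is still buffered inside the transform.
            decoder.FlushFinalBlock();
        }
    }
}
```

With the names from the question, the call would be roughly Base64RangeDecoder.DecodeRange(context.InStream, CDataStart, CDataEnd, context.OutStream). If the output stream has to stay open afterwards, newer framework versions offer a CryptoStream constructor overload with a leaveOpen parameter.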