[jsword-devel] Code optimisation request

Sat Jun 26 06:56:30 MST 2010

Hi DM,

If the block is big then I would need to look at changing the BlockType of
the bible to chapter from book but I am hoping to avoid that because it
would be nice just to download standard documents form CrossWire and all the
bibles I have seen have a block type of book.  Although if there were some
bibles with a block type of chapter I think it would be more suitable for a
mobile.

Setting the expectedLength on the baos can save up to 2x the amount of
memory allocation e.g. if required size is1025 then allocated memory will
reach 2048 using the default doubling method before being sufficient.

I haven't gone into InflaterInputStream and couldn't follow all your logic
but are you agreeing that removing the redundant buffer and setting the
initial size of the baos correctly are good ideas?  Anyway I found that
making both changes had an instant impact on OOMs although it could just be
that just one of the changes was significant.

Many thanks
Martin

On 26 June 2010 13:21, DM Smith <dmsmith at crosswire.org> wrote:

> Welcome Martin,
>
> I'm excited about your work.
>
> Yes you can make changes! But not directly. Only a few people can commit
> changes directly.
>
> The preferred method is to open an issue in Jira at www.crosswire.org/bugsunder the appropriate bucket. In this case it would be "Common Java
> Utilities". And attach a patch to it.
>
> This does several things:
> 1) It keeps the issue from getting lost in this mailing list.
> 2) It allows for discussion/iteration there.
> 3) It is a centralized location for changes, allowing one to know what went
> into a release.
> ...
>
> That said, see below:
>
> In Him,
> DM
>
> On Jun 26, 2010, at 5:22 AM, Martin Denham wrote:
>
> Hi,
>
> I am not sure if I can make code changes in the JSword-common but I haven't
> made any before so I thought I would request the change.
>
> In Zip.java, in the uncompress method, could you change:
>         ByteArrayOutputStream bos = new ByteArrayOutputStream();
>         BufferedOutputStream out = new BufferedOutputStream(bos,
> expectedLength);
>         InflaterInputStream in = new InflaterInputStream(input, new
> Inflater(), expectedLength);
>         byte[] buf = new byte[expectedLength];
> To:
>         ByteArrayOutputStream out = new
> ByteArrayOutputStream(expectedLength);
>         InflaterInputStream in = new InflaterInputStream(input, new
> Inflater(), expectedLength);
>         byte[] buf = new byte[expectedLength];
>
> There are 2 optimisations:
>
>    1. ByteArrrayOutputStream is passed the expectedLength parameter
>
> I traced through the code and I'm not sure that this fixes the problems. If
> expectedLength is as is implied by the routine, the expected length of the
> uncompressed result, and a block is big, there might still be OOMs.
>
> The ByteArrayOutputStream is naive in its allocation of additional space
> for its internal buffer. If not given a size, it starts at 32 and then
> doubles each time it needs to grow the buffer. Setting the initial size to
> the expected size of the final result, eliminates re-allocation. This change
> is an important speed improvement. And if garbage collection is not
> optimized, it will save memory.
>
> At the end, the buffer is retrieved with toByteArray, which creates a copy
> of the buffer. So at a minimum, with this change there are two copies until
> "bos" is garbage collected. To eliminate that final copy, we'd have to write
> our own or find something else.
>
> I also looked at InflaterInputStream. Here I am not sure that one should
> pass it expectedLength. It will set the internal size of its buffer. I'm not
> sure this buffer needs to be the expected size. It defaults to 512 and once
> set does not grow. It just controls how often one gets more bytes from the
> stream to inflate. I'm thinking that perhaps it should be small, either the
> default or perhaps 1024, 2048, 4096, ... Maybe min(expectedLength, 4096)?
>
> Allocation in Zip.java of buf, probably doesn't need to be the expected
> size either. It can probably be set to the same as what is passed to the
> inflater stream.
>
> So right now, I think that with your improvement, to unzip requires 4x the
> expected length or more.
>
>
>
>
>    1. BufferedOutputStream is not used because there is no benefit to
>    buffering byte array streams.
>
> Yes, I see this now. Reading the file may need to be buffered, but he byte
> array streams have their own internal buffering.
>
>
>
> These changes may seem trivial but I am attempting to use JSword on an
> Android mobile and the above prevents many OutOfMemory errors.
>
> Many thanks
> Martin
> _______________________________________________
> jsword-devel mailing list
> jsword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/jsword-devel
>
>
>
> _______________________________________________
> jsword-devel mailing list
> jsword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/jsword-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.crosswire.org/pipermail/jsword-devel/attachments/20100626/507f5565/attachment-0001.html>