Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LibreOffice zip file validation changes #817

Open
rombust opened this issue Nov 6, 2024 · 4 comments
Open

LibreOffice zip file validation changes #817

rombust opened this issue Nov 6, 2024 · 4 comments

Comments

@rombust
Copy link

rombust commented Nov 6, 2024

I use the minizip-ng library to modify "content.xml" in a LibreOffice Draw document, in my C++ application

Since LibreOffice "LibreOffice_24.8.2_Win_x86-64", when opening the document, I get "The file "example.odg" is corrupt and therefore cannot be opened. LibreOffice can try to repair the file." . When the document is "repaired", the file opens as expected.

This is also in the daily development build v24.8 - 2024-11-06_09.10.55.
It works in LibreOffice 7.6.7.2

To simplify the debugging, I can recreate the issue as follows (with setup and error checking removed)

err = mz_zip_reader_goto_first_entry(zip_reader);
do
   {
     err = mz_zip_reader_entry_get_info(zip_reader, &file_info);
     err = mz_zip_writer_copy_from_reader(zip_writer, zip_reader);
    err = mz_zip_reader_goto_next_entry(zip_reader);
  while(err != MZ_END_OF_LIST)

The example.odg was created by LibreOffice.
If I use "minizip.exe" to extract the files, then "minizip.exe" to compress the files into "example_minizip_fixed.odg", my code works.

Looking at the list of these files, they differ. Notice the Libreoffice version (example.odg), "Attribs" are set to 0


C:\tmp>minizip -l  example.odg
     Packed     Unpacked Ratio Method   Attribs Date     Time  CRC-32     Name
      ------     -------- ----- ------   ------- ----     ----  ------     ----
          43           43  100% stored         0 11-06-24 10:08 c42e039f   mimetype
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/menubar/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/progressbar/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/popupmenu/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/toolbar/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/statusbar/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/images/Bitmaps/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/floater/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/toolpanel/
           0            0    0% stored         0 11-06-24 10:08 00000000   Configurations2/accelerator/
        3000        19644   15% deflate         0 11-06-24 10:08 55f98da4   styles.xml
        1358         7360   18% deflate         0 11-06-24 10:08 af4b7d1c   content.xml
        1068         1068  100% stored         0 11-06-24 10:08 fcd7adaa   Thumbnails/thumbnail.png
        2552        50225    5% deflate         0 11-06-24 10:08 0283cbd5   settings.xml
         456         1094   41% deflate         0 11-06-24 10:08 9d6f981a   meta.xml
         292          965   30% deflate         0 11-06-24 10:08 73935a56   META-INF/manifest.xml

C:\tmp>minizip -l  example_minizip_fixed.odg
     Packed     Unpacked Ratio Method   Attribs Date     Time  CRC-32     Name
      ------     -------- ----- ------   ------- ----     ----  ------     ----
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/accelerator/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/floater/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/images/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/images/Bitmaps/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/menubar/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/popupmenu/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/progressbar/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/statusbar/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/toolbar/
           0            0    0% stored        10 11-06-24 10:27 00000000   Configurations2/toolpanel/
        1354         7360   18% deflate        80 11-06-24 10:08 af4b7d1c   content.xml
           0            0    0% stored        10 11-06-24 10:27 00000000   META-INF/
         290          965   30% deflate        80 11-06-24 10:08 73935a56   META-INF/manifest.xml
         459         1094   41% deflate        80 11-06-24 10:08 9d6f981a   meta.xml
          43           43  100% deflate        80 11-06-24 10:08 c42e039f   mimetype
        2520        50225    5% deflate        80 11-06-24 10:08 0283cbd5   settings.xml
        3002        19644   15% deflate        80 11-06-24 10:08 55f98da4   styles.xml
           0            0    0% stored        10 11-06-24 10:27 00000000   Thumbnails/
         659         1068   61% deflate        80 11-06-24 10:08 fcd7adaa   Thumbnails/thumbnail.png

Looking into


int32_t mz_zip_writer_entry_open(void *handle, mz_zip_file *file_info) {
. . .
   if (mz_zip_attrib_is_dir(writer->file_info.external_fa, writer->file_info.version_madeby) != MZ_OK) 
. . . 

The external_fa is set to 0, therefore "mz_zip_attrib_is_dir" will never return MZ_OK. Thus the hash is always calculated.
Note, mz_zip_writer_copy_from_reader also uses mz_zip_attrib_is_dir .

I tried forcing the correct Attribs, thinking this was the source of the problem. However LibreOffice still reported the file is corrupt.

Has anyone else seen this?

Observation 1) minizip.exe stores "Configurations2/" name in the above example. I am unsure if that's an issue.
Observation 2) The output.odg file that my code (using mz_zip_writer_copy_from_reader) creates. Using minizip.exe, If I extract it, then recompress it. LibreOffice will open the file without issues.

@pmqs
Copy link
Contributor

pmqs commented Nov 7, 2024

It's been a while, but my recollection of the equivalent issue with Office files, which also use zip behind the scenes, is that the exact layout of the zipfile needs to be maintained. So things like the order the files are stored in the zip will matter. Whether they are compressed or stored and whether the file is stored streamed or not streamed.

In this case the two things that stand out are the order of the files and the presence of the directory entries (Configurations2/ and Thumbnails/.

Whether one of these is the root-case here is difficult to tell definitively without knowing ho Libre handles zip files -- I just know that the equivalent Office file were very picky about the structure of the zipfile.

You could also try running zipdetails on the before & after zipfiles to see what other internal metadata is different.

@rombust
Copy link
Author

rombust commented Nov 7, 2024

Thank you. I will have a look at that. "zipdetails" will be helpful.

I have tracked down to when LibreOffice changed (downloading and debugging source)


package/source/zippackage/ZipPackage.cxx 
32cad89 package: ZipPackage: add additional check for entries STORED with by Michael Stahl · 3 months ago

In function they added below, the "entry STORED with data descriptor but not encrypted" exception is thrown.


void ZipPackage::checkZipEntriesWithDD()
{
    if (!m_bForceRecovery)
    {
        ZipEnumeration entries{m_pZipFile->entries()};
        while (entries.hasMoreElements())
        {
            ZipEntry const& rEntry{*entries.nextElement()};
            if ((rEntry.nFlag & 0x08) != 0 && rEntry.nMethod == STORED)
            {
                uno::Reference<XPropertySet> xStream;
                getByHierarchicalName(rEntry.sPath) >>= xStream;
                if (!xStream->getPropertyValue("WasEncrypted").get<bool>())
                {
                    SAL_INFO("package", "entry STORED with data descriptor but not encrypted: \"" << rEntry.sPath << "\"");
                    throw ZipIOException(
                        THROW_WHERE
                        "entry STORED with data descriptor but not encrypted");
                }
            }
        }
    }
}

I will investigate. Thank you

@rombust
Copy link
Author

rombust commented Nov 8, 2024

I have created a LibreOffice bug report for comments. Bug 163818 - Opening ODG file is reported to be corrupted when created by 3rd party software .

@rombust
Copy link
Author

rombust commented Nov 12, 2024

I still haven't located the problem. A lot of red herrings and going around in circles! (This usually means that I am missing something obvious)

A workaround:

Instead of using "mz_zip_writer_copy_from_reader(zip_writer, zip_reader);"

The following code fixes the issue:

mz_zip_file* file_info = nullptr;
err = mz_zip_reader_entry_get_info(zip_reader, &file_info);
if (err != MZ_OK)
{
	(handle error)
}
err = mz_zip_reader_entry_open(zip_reader);
if (err != MZ_OK)
{
	(handle error)
}

mz_zip_file new_file_info{ 0 };
new_file_info.version_madeby = file_info->version_madeby;// 2605;
new_file_info.compression_method = 8;
new_file_info.filename = file_info->filename;
new_file_info.filename_size = file_info->filename_size;
new_file_info.uncompressed_size = file_info->uncompressed_size;
new_file_info.flag = file_info->flag;

std::vector<char> dest;
dest.resize(file_info->uncompressed_size + 1);

if (file_info->uncompressed_size)
{
	auto size = mz_zip_reader_entry_read(zip_reader, &dest[0], file_info->uncompressed_size);
	if (size != file_info->uncompressed_size)
	{
		(handle error)
	}
}

err = zip_minizip_writer.mz_zip_writer_entry_open(&new_file_info);
if (err != MZ_OK)
{
	(handle error)
}

if (mz_zip_writer_entry_write(zip_writer, &dest[0], file_info->uncompressed_size) != file_info->uncompressed_size)
{
	((handle error)
}
mz_zip_writer_entry_close(zip_writer);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants