Does Microsoft OneDrive export large ZIP files that are corrupt?

11 March 2020

“Broken zip £12.50” by dichoecho, used under CC BY / Cropped from original.

We recently started using Microsoft OneDrive at work. The other day a colleague used OneDrive to share a folder containing a large number of ISO images with me. Since I wanted to work with these files on my Linux machine at home, and no official OneDrive client for Linux exists at this point, I used OneDrive’s web client to download the contents of the folder. This resulted in a 6 GB ZIP archive. When I tried to extract this ZIP file with the archive manager of my operating system (Linux Mint 19.3 MATE), I got an error dialog saying that “An error occurred while loading the archive”:

The output from the underlying extraction tool (7-zip) reported a “Headers Error”, with an “Unconfirmed start of archive”. It also reported a warning that “There are data after the end of archive”. No actual data were extracted whatsoever. This all looked a bit worrying, so I decided to have a more in-depth look at this problem.

Extracting with Unzip

As a first test I tried to extract the file from the terminal with unzip (v. 6.0), using the following command:

unzip kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip

This resulted in the following output:

Archive:  kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip
warning [kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip]:  1859568605 extra bytes at beginning or within zipfile
  (attempting to process anyway)
error [kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip]:  start of central directory not found;
  zipfile corrupt.
  (please check that you have transferred or created the zipfile in the
  appropriate BINARY mode and that you have compiled UnZip properly)

So, according to unzip the file is simply corrupt. Unzip wasn’t able to extract any actual data.

Extracting with 7-zip

Next I tried to extract the file with 7-zip (v. 16.02) using this command:

7z x kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip

This resulted in the following (lengthy) output:

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (506E3),ASM,AES-NI)

Scanning the drive for archives:
1 file, 6154566547 bytes (5870 MiB)

Extracting archive: kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip

ERRORS:
Headers Error
Unconfirmed start of archive


WARNINGS:
There are data after the end of archive

--
Path = kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip
Type = zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
Physical Size = 4330182775
Tail Size = 1824383772

ERROR: CRC Failed : kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f/afd3f61a-5e0e-11ea-ab97-40b0341fbf5f/08.wav
                                                                              
Sub items Errors: 1

Archives with Errors: 1

Warnings: 1

Open Errors: 1

Sub items Errors: 1

Here we see the familiar “Headers Error” and “Unconfirmed start of archive” errors, as well as an error about a cyclic redundancy check (CRC) that failed on an extracted file. Unlike unzip, 7-zip does succeed in extracting some of the data, but since the extracted folder is only 4.3 GB in size while the ZIP file is 6 GB, the extraction is incomplete.
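As a quick sanity check on these numbers: the “Physical Size” and “Tail Size” values reported by 7-zip add up exactly to the size of the file on disk. This suggests that 7-zip only recognised the first 4.3 GB as archive data, and treated everything after that as trailing data. A minimal Python check, using only values copied from the output above:

```python
# Values copied from the 7-zip output above
file_size = 6154566547      # "1 file, 6154566547 bytes (5870 MiB)"
physical_size = 4330182775  # "Physical Size"
tail_size = 1824383772      # "Tail Size"

# The recognised part plus the "tail" accounts for the whole file
print(physical_size + tail_size == file_size)
```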

4 GiB size limit and ZIP64

At this point I started wondering if these issues could be related to the size of this particular ZIP file, especially since I have been able to process zipped OneDrive folders before without any problems. The Wikipedia entry on ZIP states that originally the format had a 4 GiB limit on the total size of the archive (as well as both the uncompressed and compressed size of a file). To overcome these limitations, a “ZIP64” extension was added to the format in version 4.5 of the ZIP specification (which was published in 2001). To be sure, I verified that both unzip and 7-zip on my machine support ZIP64¹.
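The 4 GiB limit follows from the fact that the original format stores sizes and offsets in 4-byte unsigned fields. The small sketch below illustrates this for the OneDrive file; the name ZIP32_LIMIT is my own and does not come from the specification:

```python
# Largest value that fits in a 4-byte unsigned field (hypothetical name)
ZIP32_LIMIT = 0xFFFFFFFF

# Size of the OneDrive ZIP file, as reported by 7-zip
file_size = 6154566547

print(ZIP32_LIMIT)              # 4294967295, which is 4 GiB minus one byte
print(file_size > ZIP32_LIMIT)  # True: this file cannot do without ZIP64
```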

Small OneDrive ZIP and home-rolled large ZIP

I did some additional tests to verify if my problem could be a ZIP64-related issue. First I downloaded a smaller (<4 GB) folder from OneDrive, and tried to extract the resulting ZIP file with unzip and 7-zip. Both were able to extract the file without any issues. Next I created two 8 GB ZIP files from data on my local machine with both the zip and 7-zip tools. I then tried to extract both files with both unzip and 7-zip (i.e. I extracted each file with both tools). Again, both extracted these files without any problems. Since these tests demonstrate that both unzip and 7-zip are able to handle both large ZIP files (which by definition use ZIP64) as well as smaller OneDrive ZIP files, this suggests that something odd is going on with OneDrive’s implementation of ZIP64.

Testing the ZIP file integrity

The zip tool has a switch that can be used to test the integrity of a ZIP file. I ran it on the problematic file like this:

zip -T kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip

Here’s the result:

Could not find:
  kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.z01

Hit c      (change path to where this split file is)
    q      (abort archive - quit)
 or ENTER  (try reading this split again): 

So, apparently the zip utility thinks this is a multi-volume archive (which it isn’t). Running this command on any of my other test files (the small OneDrive file, and the large files created by zip and 7-zip) didn’t result in any errors.

Tests with Python’s zipfile module

The Python programming language by default includes a zipfile module, which has tools for reading and writing ZIP files. So, I wrote the following script, which opens the ZIP file in read mode, and then reads its contents (I used Python 3.6.9 for this):

import zipfile

# Open ZIP file
myZip = zipfile.ZipFile("kb-4d8a2f9a-5e0b-11ea-9376-40b0341fbf5f.zip",
                        mode='r')

# Read all files in archive and check their CRCs and file headers.
myZip.testzip()

# Close the ZIP file
myZip.close()

Running the script raised the following error:

zipfile.BadZipFile: zipfiles that span multiple disks are not supported

This looks somewhat related to the outcome of zip’s integrity test, which reported a multi-volume archive. Looking at the source code of the zipfile module shows that this particular error is raised if a check on 2 data fields from the “zip64 end of central dir locator” fails:

if diskno != 0 or disks > 1:
  raise BadZipFile("zipfiles that span multiple disks are not supported")

Here’s the description of this data structure in the format specification:

zip64 end of central dir locator 
      signature                       4 bytes  (0x07064b50)
      number of the disk with the
      start of the zip64 end of 
      central directory               4 bytes
      relative offset of the zip64
      end of central directory record 8 bytes
      total number of disks           4 bytes

Python’s zipfile raises the error if either the value of the “number of the disk with the start of the zip64 end of central directory” field (variable diskno) isn’t equal to 0, or the “total number of disks” (variable disks) is larger than 1. So, I opened the file in a Hex editor, and zoomed in on the “zip64 end of central dir locator”:

Here, the highlighted bytes (0x504b0607) make up the signature of the “zip64 end of central dir locator”². The 4 bytes inside the blue rectangle contain the “number of the disk” value. Here, its value is 0, which is the correct and expected value. The 4 bytes inside the red rectangle contain the “total number of disks” value, which is also 0. But this is really odd, since neither value should trigger the “zipfiles that span multiple disks are not supported” error! Also, a check on the 8 GB ZIP files that I had created myself with zip and 7-zip showed both to have a value of 1 for this field. So what’s going on here?
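For readers without a hex editor at hand, the same fields can also be read with a few lines of Python. The following is only a rough sketch: the function name and the size of the search window (tail_bytes) are my own assumptions, and a reverse search for the 4 signature bytes could in principle hit file data that happens to contain the same bytes.

```python
import struct

# "zip64 end of central dir locator": signature, disk number,
# offset of zip64 end of central dir record, total number of disks
# (20 bytes, all values little-endian)
LOCATOR = struct.Struct("<4sLQL")
SIGNATURE = b"PK\x06\x07"  # 0x07064b50, stored little-endian

def read_zip64_locator(path, tail_bytes=66000):
    """Return (diskno, offset, disks), or None if no locator is found."""
    with open(path, "rb") as f:
        f.seek(0, 2)                       # go to the end of the file
        size = f.tell()
        f.seek(max(0, size - tail_bytes))  # assumed search window
        tail = f.read()
    pos = tail.rfind(SIGNATURE)            # reverse search for the signature
    if pos < 0 or pos + LOCATOR.size > len(tail):
        return None
    _, diskno, offset, disks = LOCATOR.unpack_from(tail, pos)
    return diskno, offset, disks
```

Calling read_zip64_locator() on a ZIP64 file returns the “number of the disk” value as the first item of the tuple, and the “total number of disks” value as the last one.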

Digging into zipfile’s history

The most likely explanation I could think of was some difference between my local version of the Python zipfile module and the latest published version on GitHub. Using GitHub’s blame view, I inspected the revision history of the part of the check that raises the error. This revealed a recent change to zipfile: prior to a patch that was submitted in May 2019, the offending check was done slightly differently:

if diskno != 0 or disks != 1:
  raise BadZipFile("zipfiles that span multiple disks are not supported")

Note that in the old situation the test would fail if disks was any value other than 1, whereas in the new situation it only fails if disks is greater than 1. Given that for our OneDrive file the value is 0, this explains why the old version results in the error. The Git commit of the patch also includes the following note:

Added support for ZIP files with disks set to 0. Such files are commonly created by builtin tools on Windows when use ZIP64 extension.
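To make the difference between the two checks explicit, here is a small side-by-side comparison. The two function names are made up for this illustration; only the conditions themselves are taken from the zipfile code shown above:

```python
def raises_before_patch(diskno, disks):
    """Condition used by zipfile before the May 2019 patch."""
    return diskno != 0 or disks != 1

def raises_after_patch(diskno, disks):
    """Condition used by zipfile after the May 2019 patch."""
    return diskno != 0 or disks > 1

# "total number of disks" values: 0 (OneDrive), 1 (zip and 7-zip), 2 (multi-volume)
for disks in (0, 1, 2):
    print(disks, raises_before_patch(0, disks), raises_after_patch(0, disks))
# 0 True False
# 1 False False
# 2 True True
```

Only the first row differs: a value of 0 triggers the error in the old version, but not in the new one.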

So could this be the vital clue we need to solve this little file format mystery? Re-running my Python test script with the latest version of the zipfile module did not result in any reported errors, so this looked hopeful for a start. But is the 0 value of “total number of disks” also the thing that makes unzip and 7-zip choke?

Hacking into the OneDrive ZIP file

To put this to the test, I first made a copy of the OneDrive ZIP file. I opened this file in a Hex editor, and did a search on the hexadecimal string 0x504b0607, which is the signature that indicates the start of the “zip64 end of central dir locator”³. I then changed the first byte of the “total number of disks” value (this is the 13th byte after the signature, indicated by the red rectangle in the screenshot) from 0x00 to 0x01:

This effectively sets the “total number of disks” value to 1 (unsigned little-endian 32-bit value). After saving the file, I repeated all my previous tests with unzip, 7-zip, as well as zip’s integrity check. The modified ZIP file passed all these tests without any problems! The contents of the file could be extracted normally, and the extraction is also complete. The file can also be opened normally in Linux Mint’s archive manager, as this screenshot shows:

So, it turns out that the cause of the problem is the value of one field in the “zip64 end of central dir locator”, which can be provisionally fixed by nothing more than changing one single bit!
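The manual edit can also be scripted. The sketch below is not what I did for this post (I used a hex editor), and the function name and the size of the search window are my own assumptions. It works on a copy of the file, looks for the locator signature near the end, and only changes the “total number of disks” field if its current value is 0:

```python
import shutil
import struct

SIGNATURE = b"PK\x06\x07"  # "zip64 end of central dir locator" signature
DISKS_OFFSET = 16          # signature (4) + disk number (4) + offset (8)
LOCATOR_SIZE = 20

def patch_total_disks(src, dst, tail_bytes=66000):
    """Copy src to dst; set 'total number of disks' in dst to 1 if it is 0.

    Returns True if dst was modified, False otherwise.
    """
    shutil.copyfile(src, dst)              # never touch the original
    with open(dst, "r+b") as f:
        f.seek(0, 2)
        size = f.tell()
        start = max(0, size - tail_bytes)  # assumed search window
        f.seek(start)
        tail = f.read()
        pos = tail.rfind(SIGNATURE)
        if pos < 0 or pos + LOCATOR_SIZE > len(tail):
            return False                   # no locator found
        field_pos = start + pos + DISKS_OFFSET
        f.seek(field_pos)
        (disks,) = struct.unpack("<L", f.read(4))
        if disks != 0:
            return False                   # nothing to fix
        f.seek(field_pos)
        f.write(struct.pack("<L", 1))      # unsigned little-endian 32-bit 1
    return True
```

Note that copying a multi-gigabyte file takes a while and needs sufficient free disk space.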

Similarity to Apple Archive utility problem

The problem looks superficially similar to an older issue with Apple’s Archive Utility, which would write corrupt ZIP archives in cases where the ZIP64 extension is needed. From the Wikipedia entry on ZIP:

Mac OS Sierra’s Archive Utility notably does not support ZIP64, and can create corrupt archives when ZIP64 would be required.

More details about this are available here, here and here. Interestingly a Twitter thread by Kieran O’Leary put me on track of this issue (which I hadn’t heard of before). It’s not clear to me if the OneDrive problem is identical or even related, but because of the similarities I thought it was at least worth a mention here.

Conclusion

The tests presented here demonstrate how large ZIP files exported from the Microsoft OneDrive web client cannot be read by widely-used tools such as unzip and 7-zip. The problem only occurs for large (> 4 GiB) files that use the ZIP64 extension. The cause of this interoperability problem is the value of the “total number of disks” field in the “zip64 end of central dir locator”. In the OneDrive files, this value is set to 0 (zero), whereas most reader tools expect a value of 1. It is debatable whether the OneDrive files violate the ZIP format specification, since the spec doesn’t say anything about the permitted values of this field. Affected files can be provisionally “fixed” by changing the first byte of the “total number of disks” field in a hex editor. However, to ensure that existing files that are affected by this issue remain accessible in the long term, we need a more structural and sustainable solution. It is probably fairly trivial to modify existing ZIP reader tools and libraries such as unzip and 7-zip to deal with these files. I’ll try to get in touch with the developers of some of these tools about this issue. Ideally things should also be fixed on Microsoft’s end. If any readers have contacts there, please bring this post to their attention!

Test file

I’ve created an openly-licensed test file that demonstrates the problem. It is available here:

https://zenodo.org/record/3715394

Update (17 March 2020)

For unzip I found this ticket on the Info-Zip issue tracker, which looks identical to the problem discussed in this post. The ticket was already created in 2013, but its current status is not entirely clear.

For 7-zip, things are slightly complicated by the fact that for Unix a separate p7zip port exists, which currently is 3 major releases behind the main 7-zip project. In any case, I’ve just opened this feature request in the p7zip issue tracker.

Meanwhile Andy Jackson has been trying to get this issue to the attention of Microsoft, so let’s see what happens from here.

Fix-OneDrive-Zip script (update 8 June 2020)

In the comments section, Paul Marquess posted a link to a small Perl script he wrote that automatically updates the “total number of disks” field of a problematic OneDrive ZIP file. The script is available here:

https://github.com/pmqs/Fix-OneDrive-Zip

I ran a quick test with my openly-licensed test file, using the following command:

fix-onedrive-zip onedrive-zip-test-zeros.zip

After running the script, the file was indeed perfectly readable. Thanks Paul!

Revision history

14 March 2020: added analysis with Python zipfile, and updated conclusions accordingly.
17 March 2020: added update with links to Info-Zip and p7zip issue trackers.
18 March 2020: added link to test file.
8 June 2020: added reference to Fix-OneDrive-Zip script by Paul Marquess.

¹ For unzip you can check this by running it with the --version switch. If the output includes ZIP64_SUPPORT this means ZIP64 is supported.
² Note that this is the big-endian representation of the signature, whereas the ZIP format specification uses little-endian representations. See more on endianness here.
³ Since the “zip64 end of central dir locator” is located near the end of the file, the quickest way to find it is to scroll to the very end of the file in the Hex editor, and then do a reverse search (“Find Previous”) from there.

Comments


matmat wrote (archived comment):

The corrupt zip files created on macOS by Finder.app, ditto, Archive Utility etc. can in fact be opened by 7-zip v.17.01 beta onwards. Unfortunately only older versions are ported to platforms other than Windows. Maybe the newer versions of 7-zip for Windows are also able to extract the corrupted zip files created by OneDrive?

I do not know of any utility other than Apple’s own tools (Finder, ditto, Archive Utility) and recent versions of 7-zip for Windows that can extract these kinds of files. Does anyone know of another tool, preferably with a FLOSS license and natively running on Linux?

2020-03-12T20:09:52Z

matmat wrote (archived comment):

I also tried (and failed) to convince the info-zip developers to add support:

Date: Wed, 20 Jun 2018 07:48:28 +0200
Subject: Re: Info-ZIP Bug report [UnZip v. large files on Mac]
From: Mattias Mattsson
To: Info-ZIP-Dev@goatley.com

On Wed, Jun 20, 2018 at 6:05 AM, Steven M. Schweda <sms@antinode.info> wrote:

(...)

>    It might be possible to get UnZip to try to unravel the mess which
> results from using a small-file program on a large file, but without a
> very powerful argument (and/or a suitable patch), I'd rank it pretty low
> on the list of UnZip features to be added.

The problem is not so much in creating compliant zip files myself, but
that Macs and Apple tools are so widely used. Especially in the
professional video/cinema industry.

I work for a large film festival where we receive lots of submissions
compressed into multi-gigabyte zip files (10-150 GB) compressed with
Apple tools. We handle these files on Linux systems. It is really hard
to convince these people that they should not use the Apple Finder to
compress their files. Trust me, I have tried. Many times. "What? It
just works on my Mac!". I have also tried submitting feedback to Apple
to fix their tools.

Right now I am not aware of a tool to successfully uncompress these
defect files natively on Linux. What we do is run 7zip on Windows or
Apple's "ditto" on macOS in a VM. This is somewhat inconvenient and I
would love to be able to do this with Info-Zip directly.

I totally understand that this whole mess is Apple's fault. Although I
think that to be able to handle these "corrupted" files would be a
really useful feature for lots of people. Unfortunately I am not
qualified myself to submit a patch for this.

I don't know if the above is enough of a powerful argument for adding
this feature, but thanks for considering!

Best,
-- 
-mattias

2020-03-12T20:32:35Z

bitsgalore wrote (archived comment):

@matmat Thanks, this is some really useful information. Don’t have access to any Windows machine right now (and because of the current COVID-19 outbreak I won’t for at least the coming 2 weeks), but I’ll give this a try later. (Side note: your comment made me realize the formatting of the comments section of my blog needs some work, will look at this later!)

2020-03-13T11:11:26Z
pmqs wrote (archived comment):

If preserving file permissions isn’t an issue when unzipping, you could try using jar. It can work in streaming-mode and read a zip file from stdin. That will bypass the problem with the incorrect value in the central directory (because it doesn’t read it).

Usage is

jar xvf /dev/stdin <file.zip

There are other programs that can unzip in streaming mode.

2020-03-17T14:33:58Z

matmat wrote (archived comment):

@pmqs Interesting!

Unfortunately it errors out with this for me testing on a 33G file created with macOS Finder.app:

$ jar xvf /dev/stdin < 33G_dir.zip 
  created: 33G_dir/
java.io.IOException: Push back buffer is full
	at java.base/java.io.PushbackInputStream.unread(PushbackInputStream.java:229)
	at java.base/java.util.zip.ZipInputStream.readEnd(ZipInputStream.java:359)
	at java.base/java.util.zip.ZipInputStream.read(ZipInputStream.java:197)
	at java.base/java.util.zip.ZipInputStream.closeEntry(ZipInputStream.java:141)
	at jdk.jartool/sun.tools.jar.Main.extractFile(Main.java:1457)
	at jdk.jartool/sun.tools.jar.Main.extract(Main.java:1364)
	at jdk.jartool/sun.tools.jar.Main.run(Main.java:409)
	at jdk.jartool/sun.tools.jar.Main.main(Main.java:1681)

Is there a way to increase the buffer? What other programs can unzip in streaming mode?

2020-03-19T03:33:19Z

pmqs wrote (archived comment):

@matmat That suggests to me that jar may support streaming for input, but output needs buffering. Not sure why it needs to do that (I have a streaming unzip in development and it doesn’t need to buffer data before outputting). I don’t use jar, so can’t really comment. No idea how to increase the buffer size.

I had a link that referenced some other streaming unzippers, but can’t find it. Try searching for the terms “stream unzip”

The only one I can remember offhand is https://github.com/madler/sunzip

2020-03-19T09:00:10Z

matmat wrote (archived comment):

@pmqs Thanks!

Tried sunzip on a Finder.app-created zip file of the same (~33G) size, which gives an error similar to what happens when you try to unzip these files with a non-stream unzip program (info-zip etc.):

$ sunzip < ../Dimmaforsvinna_SHR-1-25_F_sv-en_30_2K_20200114_SMPTE_OV.zip 
7 entries processed
Dimmaforsvinna_SHR-1-25_F_sv-en_30_2K_20200114_SMPTE_OV/ OK
Dimmaforsvinna_SHR-1-25_F_sv-en_30_2K_20200114_SMPTE_OV/j2c_48f5df27-c056-4300-aaba-a23d20f070bf.mxf OK
Dimmaforsvinna_SHR-1-25_F_sv-en_30_2K_20200114_SMPTE_OV/pcm_fcb2fd32-cd59-4f6c-a144-dd733da49c28.mxf
sunzip abort: zip file format error (local/central offsets mismatch)

To this day the only tool I have found working for these files outside of Apple’s own is 7-zip for Windows :(

2020-03-19T10:52:02Z

pmqs wrote (archived comment):

@matmat Are there any examples of the zip file available online anywhere? I can try my code against it & see what I get.

2020-03-19T13:17:52Z
bitsgalore wrote (archived comment):
@pmqs I’d come across the suggestion to use jar for this before, but when I try it gives me the following error:
```
java.util.zip.ZipException: only DEFLATED entries can have EXT descriptor
	at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:311)
	at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:123)
	at jdk.jartool/sun.tools.jar.Main.extract(Main.java:1361)
	at jdk.jartool/sun.tools.jar.Main.run(Main.java:409)
	at jdk.jartool/sun.tools.jar.Main.main(Main.java:1681)
```
Strangely the very same error also happens for smaller OneDrive files (that don’t use ZIP64), but it does work for ZIP files that I create myself. I suspect this error is unrelated to ZIP64, but related to the fact that the items in the OneDrive ZIPs aren’t flate-compressed.
2020-03-19T13:31:55Z
matmat wrote (archived comment):

@pmqs I am uploading a zip file to you for testing! Please do not distribute. Given some time I could probably produce a file that can be freely shared. Let me know if you need anything else.

This file extracts correctly with Apple tools and also with latest 7-zip on Windows.

2020-03-19T14:40:11Z
bitsgalore wrote (archived comment):

@matmat Getting back to your first comment: I can now confirm that the 20.0 alpha version of 7-zip for Windows is able to extract the OneDrive files. See also this ticket I opened for the p7zip port. Hopefully these changes will be ported over to p7zip. I also created an openly-licensed test file, available here (added the link to the blog post as well):

https://zenodo.org/record/3715394

2020-03-19T14:49:56Z
matmat wrote (archived comment):

@bitsgalore Hopefully! But the latest port version was for version 16.02 released almost four years ago: https://sourceforge.net/projects/p7zip/files/p7zip/ so I would not hold my breath.

2020-03-19T15:18:26Z
pmqs wrote (archived comment):

> @pmqs I am uploading a zip file to you for testing! Please do not distribute. Given some time I could probably produce a file that can be freely shared. Let me know if you need anything else.

Thanks - give me a shout when that’s available & where it is uploaded to.

2020-03-19T19:33:31Z
pmqs wrote (archived comment):

@matmat I downloaded the enormous zip file you sent me. No luck though. I’m afraid my code is too immature to deal with it.

This archive uses none of the Zip64 extensions that must be used when the file is > 4Gig.

At the end of the file the End of central directory record is present as expected, but there is no Zip64 end of central directory locator record at all. Similarly the extra Zip64 fields that should be present in the central directory headers are not present

This is just a 32-bit zip file that has had 64-bit data shoe-horned into it. All the 32-bit fields have just overflown.

A seriously badly-formed zip file.

2020-03-23T22:50:16Z
snegel wrote (archived comment):

Johan, fifteen minutes ago I tried to unzip in Linux Mint a 4.2GiB file from OneDrive, and ran into the same problem. I read your post, changed one byte (actually one bit): problem solved! Thanks! Gertjan

2020-03-27T17:08:05Z
bitsgalore wrote (archived comment):

@snegel Thanks for reporting back, good to hear this is working for others as well!

2020-03-31T12:22:41Z
jhogervorst wrote (archived comment):

Thanks a lot, your trick solved the problem for me as well! 💯

2020-05-26T11:41:28Z
pmqs wrote (archived comment):

I created a small script to do the one byte update of these OneDrive zip files. See Fix-OneDrive-Zip for more details.

2020-05-30T23:19:28Z
brunoicq wrote (archived comment):

We recently started using Microsoft OneDrive at work (by using Web Interface) and I have to upload 20-30Gb of big zipped files (2-5Gb each one) every day. I already tried to use a lot of tools (WinZIP, 7-ZIP, WinRAR, and JZip), and I always had this kind of issue (corrupted files)

I think the problem is the OneDrive service, cause when I upload these files to other servers (such as GDrive) I never have this kind of issue.

2020-06-05T20:38:21Z
bitsgalore wrote (archived comment):

@pmqs Thanks for posting the link to your script, this is really useful! I just added another update to my post in which I refer to it.

2020-06-08T11:21:55Z
bitsgalore wrote (archived comment):

@brunoicq Apart from the bad ZIP file problem OneDrive has various other issues that make it a complete pain to use. As an example, see the below Twitter thread on my attempt at uploading some 6 GB of data through the web interface:

https://twitter.com/bitsgalore/status/1240006157755977729

Apart from that the OneDrive web client also resets the time stamps (creation and last-modified date and time) on any uploaded files. From what I can gather both issues only occur when you use the web interface (the native Windows client seems to work pretty well).

2020-06-08T11:57:07Z
LndTown wrote (archived comment):

Hi there,

I’ve been trying to figure this issue out for months now - it’s beyond frustrating (same thing happens to me, when I download large files/folders from OneDrive or SharePoint, the zip is corrupt). However, I have no coding background and am just a regular Mac/PC end user that’s just been frustrated and searching on Google how to fix this and came across this thread.

With the script fix, how can I run this on my Mac? Step by step would be much appreciated.

Thank you so much!

2020-07-15T21:22:58Z
mattdig wrote (archived comment):

This was super helpful information. Thank you. But when I tried to run the fixer on a 20GB archive, I get

Error: Cannot find Zip signature at end of ‘OneDrive_1_8-19-2020.zip’

It’s an archive consisting of 25 video files, each about 1.5GB. It was generated by OneDrive when I selected all the files for download at once. Downloading them one at a time is going to be a pain but I might as well get started.

2020-08-19T13:16:30Z
pmqs wrote (archived comment):
> This was super helpful information. Thank you. But when I tried to run the fixer on a 20GB archive, I get
>
> Error: Cannot find Zip signature at end of ‘OneDrive_1_8-19-2020.zip’
>
> It’s an archive consisting of 25 video files, each about 1.5GB. It was generated by OneDrive when I selected all the files for download at once. Downloading them one at a time is going to be a pain but I might as well get started.

Can you run zipdetails against the OneDrive file and post the results or add to a gist?
```
zipdetails -v 'OneDrive_1_8-19-2020.zip'
```
Also, run perl -V (note that is uppercase V) and post the output.
2020-08-19T14:15:04Z
mattdig wrote (archived comment):

I tried zipdetails but after 12 hours I cancelled it. I’m going to try it again overnight.

Full output from perl -V:

Summary of my perl5 (revision 5 version 18 subversion 4) configuration:

Platform: osname=darwin, osvers=19.0, archname=darwin-thread-multi-2level uname=’darwin osx397.sd.apple.com 19.0 darwin kernel version 18.0.0: tue jul 9 11:12:08 pdt 2019; root:xnu-4903.201.2.100.7~1release_x86_64 x86_64 ‘ config_args=’-ds -e -Dprefix=/usr -Dccflags=-g -pipe -Dldflags= -Dman3ext=3pm -Duseithreads -Duseshrplib -Dinc_version_list=none -Dcc=cc’ hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc=’cc’, ccflags =’ -g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector’, optimize=’-Os’, cppflags=’-g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector’ ccversion=’’, gccversion=’4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.29.20) (-macos10.15-objc-selector-opts)’, gccosandvers=’’ intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype=’long’, ivsize=8, nvtype=’double’, nvsize=8, Off_t=’off_t’, lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld=’cc’, ldflags =’ -fstack-protector’ libpth=/usr/lib /usr/local/lib libs= perllibs= libc=, so=dylib, useshrplib=true, libperl=libperl.dylib gnulibc_version=’’ Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=’ ‘ cccdlflags=’ ‘, lddlflags=’ -bundle -undefined dynamic_lookup -fstack-protector’

Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS PERL_DONT_CREATE_GVSV PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP PERL_PRESERVE_IVUV PERL_SAWAMPERSAND USE_64_BIT_ALL USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF USE_REENTRANT_API Locally applied patches: /Library/Perl/Updates/ comes before system perl directories installprivlib and installarchlib points to the Updates directory Built under darwin Compiled at Jun 5 2020 17:34:07 @INC: /Library/Perl/5.18/darwin-thread-multi-2level /Library/Perl/5.18 /Network/Library/Perl/5.18/darwin-thread-multi-2level /Network/Library/Perl/5.18 /Library/Perl/Updates/5.18.4 /System/Library/Perl/5.18/darwin-thread-multi-2level /System/Library/Perl/5.18 /System/Library/Perl/Extras/5.18/darwin-thread-multi-2level /System/Library/Perl/Extras/5.18

2020-08-21T21:22:36Z
pmqs wrote (archived comment):
> It tried zipdetails but after 12 hours I cancelled it. I’m going to try it again over night.

It should run in seconds. If it is taking that long to run I suspect it is attempting to find the end of the Zip Central Directory record by scanning backwards from the end of the file. After 12 hours it really should have found it. That suggests it isn’t there.

Try running with the --scan option – that will get the program to work from the start of the file

zipdetails -v --scan 'OneDrive_1_8-19-2020.zip'

You should start to see output immediately. Given the size of the file it may take a while to run. The part we need to see is the END CENTRAL HEADER at the end - it looks like this
```
1400003A4 END CENTRAL HEADER    06054B50
1400003A8 Number of this disk   0000
1400003AA Central Dir Disk no   0000
1400003AC Entries in this disk  0005
1400003AE Total Entries         0005
1400003B0 Size of Central Dir   000001AC
1400003B4 Offset to Central Dir FFFFFFFF
1400003B8 Comment Length        0000
```
> Full output from perl -V Summary of my perl5 (revision 5 version 18 subversion 4) configuration:

That looks fine.
2020-08-21T22:15:14Z
berndmeyer wrote (archived comment):

just fixed a 50GB zip from one drive with this in seconds. Thank-you SO much. What a lifesaver.

2020-11-26T03:20:15Z
stripydog wrote (archived comment):

Nice analysis Johan. I found this page only after having gone through the pain of diagnosing this problem for myself, in our case when we created huge zip files using windows “send to-> compressed (zipped) folder”. No joy from microsoft support. If anyone for any reason wants a Compiled/C solution, the utility for fixing MS zip files I wrote before finding pmqs’s script is at https://github.com/stripydog/fixmszip.git Should work on big and little endian systems (although I haven’t actually tested on big endian yet…)

2021-02-15T16:54:24Z
Fulkeg wrote (archived comment):

I am a total new guy to this so forgive the question. But how do I install the code to fix extraction errors?

2021-02-26T07:58:37Z
hellmi-pelmi wrote (archived comment):

Thank you very much indeed for the thorough analysis, and excellent documentation. I was running into the same issue with +4GiB downloads from OneDrive: I was using 7-zip version 16.x (64-bit), and while it worked fine for all OD downloads smaller than 4GiB, it balked at any that were larger along the lines of the problem description in your article. I have just installed 7-zip version 19.00 (also 64-bit), and that works just fine on exactly the same large OD downloads. I have also tried the “unzip” (x64) version that is currently available from the OpenSUSE Linux 15.2 update service, which equally works just fine with these larger files. So touching up the bits to make it pass may no longer be required.

2021-03-02T02:52:11Z
hellmi-pelmi wrote (archived comment):

@Fulkeg wrote: “I am a total new guy to this so forgive the question. But how do I install the code to fix extraction errors?”

Take a look at the comment that I just added - you may no longer need this work-around.

2021-03-02T02:53:00Z
abhivij wrote (archived comment):

Just found from my own experience that there is this 4GB limit, had started downloading individual files, and saw this.

Thank you for the explanation @bitsgalore

@hellmi-pelmi unfortunately the latest version of 7z isn’t available for Linux. Will have to go with this work-around

2021-07-20T11:58:28Z

abhivij wrote (archived comment):

@pmqs The script unfortunately didn’t resolve the zip file issue.

I have a zip file of 20GB.

fix-onedrive-zip executed properly on the file, and now when I try to rerun it I get:

./fix-onedrive-zip RNA.zip 

Checking 'RNA.zip'
Nothing to do: 'Total Number of Disks' field is already 1

However, the zip file is still marked as corrupt

7z l RNA.zip 

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,64 CPUs AMD EPYC 7502P 32-Core Processor                (830F10),ASM,AES-NI)

Scanning the drive for archives:
1 file, 20794162907 bytes (20 GiB)

Listing archive: RNA.zip

--
Path = RNA.zip
Type = zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
Physical Size = 5467709299
Tail Size = 15326453608

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2021-07-20 12:25:00 .....   1365918368   1365918368  RNA/HB1_S56_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....   1030131302   1030131302  RNA/HB10_S60_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....   1046828338   1046828338  RNA/HB11_S21_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....    939749578    939749578  RNA/HB12_S48_L001_R1_001.fastq.gz
2021-07-20 12:27:00 .....            0   1085081299  RNA/HB13_S70_L002_R1_001.fastq.gz
------------------- ----- ------------ ------------  ------------------------
2021-07-20 12:27:00         4382627586   5467708885  5 files

Warnings: 1

Errors: 1

zipdetails -v RNA.zip 
zzz at /usr/bin/zipdetails line 1604.

I’m wondering if the total number of disks field could take values greater than 1 too. I mean, if the file size is 2 * 4GB, the total number of disks field would be expected to be 1; if 3 * 4GB, then 2; something like that.

The reason I suspected this is because I tried the script on a smaller zip file of 4.1 GB and that worked fine

2021-07-21T02:25:21Z

pmqs wrote (archived comment):
@pmqs The script unfortunately didn’t resolve the zip file issue.

…

```
7z l RNA.zip

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,64 CPUs AMD EPYC 7502P 32-Core Processor (830F10),ASM,AES-NI)

Scanning the drive for archives:
1 file, 20794162907 bytes (20 GiB)

Listing archive: RNA.zip

--
Path = RNA.zip
Type = zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
Physical Size = 5467709299
Tail Size = 15326453608

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2021-07-20 12:25:00 .....   1365918368   1365918368  RNA/HB1_S56_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....   1030131302   1030131302  RNA/HB10_S60_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....   1046828338   1046828338  RNA/HB11_S21_L001_R1_001.fastq.gz
2021-07-20 12:26:00 .....    939749578    939749578  RNA/HB12_S48_L001_R1_001.fastq.gz
2021-07-20 12:27:00 .....            0   1085081299  RNA/HB13_S70_L002_R1_001.fastq.gz
------------------- ----- ------------ ------------  ------------------------
2021-07-20 12:27:00         4382627586   5467708885  5 files

Warnings: 1

Errors: 1
```

That output suggests there is more wrong with your zip file than just the Total Number of Disks issue. It looks like more serious corruption. That could mean the file got corrupted while downloading, or that the corruption is baked into the original file. Can you share any details on how the file was created?

Also, is RNA.zip available to download?

I suspect it won’t make any difference, but have you tried testing the zip file using unzip -t RNA.zip?

zipdetails -v RNA.zip
zzz at /usr/bin/zipdetails line 1604.

That is an old version of zipdetails you are running. Can you try with the latest version that is available here. If you still have a copy of RNA.zip from before you ran fix-onedrive-zip, can you run zipdetails on it?

If you get the error Cannot find 'Zip64 end of central directory record': 0x06054b50 can you run

zipdetails --scan -v RNA.zip

That may take a very long time to run, so you need to be patient.

I’m wondering if the total number of disks field could take values greater than 1 too. I mean, if the file size is 2 * 4GB, the total number of disks field would be expected to be 1; if 3 * 4GB, then 2; something like that.

The reason I suspected this is because I tried the script on a smaller zip file of 4.1 GB and that worked fine

No - that field has nothing to do with the overall size of the zip file. It is a legacy feature that dates back to a time when you had to split zip files across multiple floppy disks.
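
For readers who want to inspect that field directly, here is a minimal Python sketch. It assumes the standard ZIP64 layout, in which a 20-byte Zip64 end of central directory locator (signature 0x07064b50) sits immediately before the end of central directory record (signature 0x06054b50), with 'Total Number of Disks' as its last 4 bytes. The function name `read_total_disks` is made up for this illustration; it is not part of fix-onedrive-zip or zipdetails, and it only reads the file.

```python
import struct

EOCD_SIG = b"PK\x05\x06"           # end of central directory record, 0x06054b50
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # Zip64 end of central directory locator, 0x07064b50
LOCATOR_SIZE = 20                  # signature + disk no. + 8-byte offset + total disks
EOCD_MIN_SIZE = 22                 # EOCD record without its optional comment
MAX_COMMENT = 65535                # the comment length is a 16-bit field


def read_total_disks(path):
    """Return 'Total Number of Disks' from the Zip64 locator, or None if absent."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        size = f.tell()
        # Only the tail of the file is needed, however large the archive is.
        tail_len = min(size, EOCD_MIN_SIZE + MAX_COMMENT + LOCATOR_SIZE)
        f.seek(size - tail_len)
        tail = f.read(tail_len)

    # Assumption: the last EOCD signature in the tail is the real one
    # (a comment that happens to contain the same bytes would fool this).
    eocd_pos = tail.rfind(EOCD_SIG)
    if eocd_pos < LOCATOR_SIZE:
        return None

    locator = tail[eocd_pos - LOCATOR_SIZE:eocd_pos]
    sig, _start_disk, _zip64_eocd_offset, total_disks = struct.unpack("<4sIQI", locator)
    if sig != ZIP64_LOCATOR_SIG:
        return None
    return total_disks
```

Calling `read_total_disks("RNA.zip")` returns the stored value whatever the size of the archive; a result of `None` means no Zip64 locator was found where this sketch looks for it.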
2021-07-21T08:45:41Z

abhivij wrote (archived comment):

Thank you for your response @pmqs

The original files are not corrupt. The directory contained 121 fastq.gz files and 1 md5sum text file. Since I was left with no other choice, I downloaded each of the files individually, verified it with the md5sum file and it matched.

I had tried downloading different combinations of subsets of these 121 files - those less than total zip file size of 4gb unzipped properly - others in range of 4gb to 8gb unzipped properly after running your script. So the original files are not corrupt.
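
The check described above, comparing each individually downloaded file against the md5sum text file, could be scripted roughly as follows. This is a sketch that assumes the text file uses the usual `md5sum` layout of one `<hex digest>  <file name>` pair per line; the actual layout of the file isn't shown here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_list(list_path, base_dir="."):
    """Return the names of listed files that are missing or whose MD5 differs."""
    failures = []
    for line in Path(list_path).read_text().splitlines():
        if not line.strip():
            continue
        # Assumed line layout: "<digest>  <name>" or "<digest> *<name>".
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()
        target = Path(base_dir) / name
        if not target.is_file() or md5_of(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty list from `verify_md5_list("TUZ8978_md5sum.txt", "RNA")` would correspond to the "it matched" result reported above.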

Sadly, I’m not allowed to share the RNA.zip file publicly.

output of unzip added below

unzip -t RNA.zip 
Archive:  RNA.zip
error: End-of-centdir-64 signature not where expected (prepended bytes?)
  (attempting to process anyway)
warning [RNA.zip]:  831651840 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  831651840
  (attempting to re-compensate)
    testing: RNA/HB1_S56_L001_R1_001.fastq.gz   OK
    testing: RNA/HB10_S60_L001_R1_001.fastq.gz   OK
    testing: RNA/HB11_S21_L001_R1_001.fastq.gz   OK
    testing: RNA/HB12_S48_L001_R1_001.fastq.gz   OK
    testing: RNA/HB13_S70_L002_R1_001.fastq.gz   OK
    testing: RNA/HB14_S92_L002_R1_001.fastq.gz   OK
    testing: RNA/HB15_S73_L002_R1_001.fastq.gz   OK
    testing: RNA/HB16_S109_L002_R1_001.fastq.gz   OK
    testing: RNA/HB17_S120_L002_R1_001.fastq.gz   OK
    testing: RNA/HB18_S90_L002_R1_001.fastq.gz   OK
    testing: RNA/HB19_S72_L002_R1_001.fastq.gz   OK
file #12:  bad zipfile offset (local header sig):  14837494612
  (attempting to re-compensate)
    testing: __RNA/HB2_S18_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB20_S107_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB21_S105_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB22_S101_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB23_S121_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB24_S87_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB25_S111_L002_R1_001.fastq.gz_Error.txt   OK
    testing: RNA/HB26_S80_L002_R1_001.fastq.gz   OK
    testing: RNA/HB27_S75_L002_R1_001.fastq.gz   OK
    testing: RNA/HB28_S76_L002_R1_001.fastq.gz   OK
    testing: RNA/HB29_S99_L002_R1_001.fastq.gz   OK
    testing: __RNA/HB3_S10_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB30_S89_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB31_S96_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB32_S84_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB33_S103_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB34_S112_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB35_S113_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB36_S100_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB37_S102_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB38_S108_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB39_S104_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB4_S7_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB40_S94_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB41_S63_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB42_S114_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB43_S67_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB44_S93_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB45_S68_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB46_S118_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB47_S71_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB49_S110_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB5_S42_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB50_S66_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB51_S106_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB52_S88_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB53_S65_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB54_S86_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB55_S115_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB56_S119_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB57_S85_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB58_S116_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB59_S79_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB60_S95_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB61_S64_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB62_S74_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB63_S117_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB64_S78_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB65_S97_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB66_S98_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB67_S91_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB68_S69_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB69_S82_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB7_S54_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB70_S81_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB71_S83_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB72_S12_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB73_S20_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB74_S43_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB75_S53_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB76_S34_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB77_S26_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB78_S50_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB79_S4_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB8_S62_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB80_S58_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB81_S39_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HB9_S77_L002_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC1_S55_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC10_S45_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC11_S11_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC12_S57_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC13_S15_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC14_S8_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC15_S28_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC16_S47_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC17_S52_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC18_S27_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC19_S1_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC2_S44_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC20_S5_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC21_S36_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC22_S13_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC4_S49_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC5_S29_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC6_S30_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC7_S2_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC8_S38_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/HC9_S22_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET1_S46_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET10_S9_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET11_S51_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET12_S35_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET13_S25_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET14_S31_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET15_S14_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET16_S32_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET17_S3_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET18_S61_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET19_S16_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET2_S6_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET20_S33_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET21_S37_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET3_S19_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET4_S23_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET5_S17_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET6_S40_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET7_S24_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET8_S59_L001_R1_001.fastq.gz_Error.txt   OK
    testing: __RNA/MET9_S41_L001_R1_001.fastq.gz_Error.txt   OK
    testing: RNA/TUZ8978_md5sum.txt   OK
    testing: ___All_Errors.txt        OK
At least one error was detected in RNA.zip.

I will try with latest zipdetails
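
The `unzip -t` listing above mixes regular entries with entries whose names end in `_Error.txt` (plus a final `___All_Errors.txt`). To tally the two kinds from saved output, a small Python sketch like the following may help. It assumes the `testing: NAME   OK` line layout shown above, and the function name is hypothetical. It only sorts names; it says nothing about what the `_Error.txt` entries contain.

```python
import re

# Matches lines such as "    testing: RNA/HB1_S56_L001_R1_001.fastq.gz   OK"
TESTING_LINE = re.compile(r"^\s*testing:\s+(?P<name>.+?)\s+OK\s*$")


def tally_unzip_test(output):
    """Split the entry names in `unzip -t` output into (regular, error_named)."""
    regular, error_named = [], []
    for line in output.splitlines():
        match = TESTING_LINE.match(line)
        if not match:
            continue  # warnings, "bad zipfile offset" notes, summary line
        name = match.group("name")
        if name.endswith(("_Error.txt", "_Errors.txt")):
            error_named.append(name)
        else:
            regular.append(name)
    return regular, error_named
```

With the output saved to a file, `regular, error_named = tally_unzip_test(open("unzip-test.log").read())` gives two lists whose lengths can be compared against the number of files expected in the folder (`unzip-test.log` is a placeholder name).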

2021-07-23T02:14:56Z

mjon wrote (archived comment):

Thanks so much for posting this. I was tearing my hair out trying to figure out why I couldn’t open a batch of 9 GB files I’d downloaded from a colleague’s OneDrive. My Mac refused to open them (“Error 79- inappropriate file type or format”) until I tried the Fix-OneDrive-Zip solution from @pmqs, then voila, they worked!

2021-08-13T22:38:30Z
nlflint wrote (archived comment):

That Perl script is a lifesaver! Thanks!

2022-07-08T06:19:04Z
gergilar wrote (archived comment):

Incredibly, this is still a problem in September 2022. Thanks a lot @pmqs, and I agree, that Perl script is awesome!

2022-09-13T19:42:00Z
