Solving the problem of large files

February 16, 2023

17a-4 LLC

Should we archive our meeting and webinar videos?   
Do we have to turn off file sharing in our chat platform?   
What do we do about all these large files our users are sharing? 

These are common questions we hear from clients.  Organizations want to let their users take advantage of all the features of their chosen chat and collaboration platforms, particularly any feature that enables business.  And why not?  Isn’t the point of new technology to make our work easier?

Chat and collaboration platforms like Zoom, Slack, Teams and Webex all share a common problem when it comes to archiving users’ data – users share documents in various formats that can quickly become too big for mail and archive limits.  An Excel workbook gets shared and collaborated on and it grows and grows and the next thing you know, it’s too big to make it into the archive – we see this all the time.  Then there are all the video, meeting and webinar files that have little to no chance of making it in.

Exchange or Gmail limits can usually be avoided by sending the data into the archive system via SMTP or directory pickups depending upon the archive’s ingestion options.  The problem is most archive size limits are still not able to accommodate these large files.

If we use Zoom data as an example, a 1-hour audio file (not including video) from a Zoom Meeting is upwards of 55 MB.  Most organizations will have an Exchange limit of 25 or 50 MB.  If we add video, the file size skyrockets and, even if the archive limit is raised to 100 MB, video files are not making it in.

Of course, you can turn off file sharing and deny video and webinar access, but will your users still be able to perform their jobs as well?  Corporate policies can cover compliance risks, but how will your organization beat competitors if employees can’t communicate with clients or colleagues as effectively?

A valid argument can be made to increase archive limits so these larger files and data types can flow in along with the associated chat data.  This is the best way to maintain the integrity of the data and facilitate contextual review in eDiscovery.  Further, this is the only way to ensure that review is complete with all attachments being indexed and searchable via the eDiscovery query being run against the data.  Not to mention, there is certainly something to be said for a unified archive.  It is the best and most efficient way to manage corporate data and simplify the application of legal, HR, IT, security, compliance and disposition policies.

The other consideration that will (and should!) come up in this conversation is the cost of the storage.  Surely, there are more cost-effective options than an email archive for storage.  Email archives are loaded with features for Supervision, eDiscovery and Reporting requirements.  These features make email archives more expensive than typical backup solutions, data lakes and blob storage systems.  You get what you pay for.

That brings us to the hybrid solution.  Chat data goes into the archive and large files are sent off to more cost-effective storage.  This does result in two locations for eDiscovery but, as long as the data in the lower-cost storage can be associated with the rest of the source data in the email archive, a complete review can be achieved.  We’ve designed the Azure Blob integration in DataParser to work for exactly this purpose.  Files can be sent to cold/warm/hot WORM storage in Azure Blob where they are easily ingested.  Storage costs are minimized, and regulatory compliance is achieved.
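
For readers who want to picture the hybrid approach, here is a minimal sketch of the routing idea in Python. It is purely illustrative and is not how DataParser is implemented: the `Attachment` type, the `route` function, the 50 MB threshold and the `blob_key` format are all assumptions made for this example. The point it shows is that a file sent to lower-cost storage keeps the ID of the chat message it came from, so the two locations can be tied back together during review.

```python
from dataclasses import dataclass

# Assumed archive size limit for this sketch; real limits vary by archive.
ARCHIVE_LIMIT_MB = 50


@dataclass
class Attachment:
    """Hypothetical record for a file shared in a chat message."""
    name: str
    size_mb: float
    message_id: str  # ID of the chat message the file was shared in


def route(att: Attachment, limit_mb: float = ARCHIVE_LIMIT_MB) -> dict:
    """Decide where a file goes and keep the link back to its chat message."""
    record = {"message_id": att.message_id, "name": att.name}
    if att.size_mb <= limit_mb:
        # Small enough to travel into the archive with the chat data.
        record["destination"] = "archive"
    else:
        # Too large: send to lower-cost storage and record a key (hypothetical
        # format) so the file can be associated with the archived message.
        record["destination"] = "blob"
        record["blob_key"] = f"{att.message_id}/{att.name}"
    return record


if __name__ == "__main__":
    print(route(Attachment("budget.xlsx", 2, "msg-001")))
    print(route(Attachment("meeting-audio.m4a", 55, "msg-002")))
```

With the assumed 50 MB limit, the 55 MB audio file from the Zoom example above would be routed to blob storage, while the small workbook would go into the archive alongside its chat message.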

Which route is best typically comes down to understanding the amount and frequency of large files, users’ regulatory status and the business needs of the organization.  Once those are balanced, one of the above options can be deployed to manage large files and let users share and collaborate using any feature that helps them work better.