Code at issue

with open(outputfile, "w") as file_handle:
    df2.to_csv(file_handle, compression='gzip')

Problem description

The written file is not gzip compressed.

Expected Output

The output should be a gzip-compressed CSV file, similar to what is obtained when using:

df2.to_csv('/path/to/file.csv.gz',compression='gzip')

Comment From: WillAyd

Maybe related to #21144

Comment From: minggli

>>> import os
>>> import pandas as pd
>>> from pandas import *
>>>
>>> pd.__version__
'0.20.3'
>>>
>>> df = DataFrame(100 * [[123, 234, 435]])
>>>
>>> with open('test_compressed', 'w') as fh:
...     df.to_csv(fh, compression='gzip')
...
>>> fh_size = os.path.getsize('test_compressed')
>>> df.to_csv('test_compressed', compression='gzip')
>>> f_size = os.path.getsize('test_compressed')
>>>
>>> os.remove('test_compressed')
>>> assert fh_size == f_size
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

This looks like existing behaviour dating back to version 0.20 or earlier.

Actually, the documentation of compression= says:

compression : string, optional

A string representing the compression to use in the output file. Allowed values are ‘gzip’, ‘bz2’, ‘zip’, ‘xz’. This input is only used when the first argument is a filename.

So this is not supported, but perhaps it could be a new use case?
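Given that restriction, one way to get gzip output through a handle is to let the handle itself do the compression. The following is only a minimal sketch, assuming the standard-library gzip module; the output path example.csv.gz is hypothetical and chosen for illustration.

```python
import gzip
import os
import tempfile

import pandas as pd

df = pd.DataFrame(100 * [[123, 234, 435]])

# Hypothetical output location, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.csv.gz")

# gzip.open in text mode ("wt") returns a handle that compresses whatever
# is written to it, so to_csv needs no compression= argument here.
with gzip.open(path, "wt", newline="") as fh:
    df.to_csv(fh)

# Read the file back to confirm it is a valid gzip-compressed CSV.
roundtrip = pd.read_csv(path, compression="gzip", index_col=0)
```

Because the compression happens in the handle rather than in pandas, this does not depend on how a given pandas version treats compression= for file-like objects.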

Comment From: toninlg

Hi,

Should the following code work with this merge, or is it related to #22555?

import os
import sys
import pandas as pd
from pandas import *

print(sys.version)
print(pd.__version__)
df = DataFrame(100 * [[123, 234, 435]])
with open('./test_compressed.gz', 'w', newline='') as fh:
    df.to_csv(fh)

fh_size = os.path.getsize('./test_compressed.gz')
df.to_csv('./test_compressed.gz')
f_size = os.path.getsize('./test_compressed.gz')
os.remove('./test_compressed.gz')
assert fh_size == f_size

3.6.7 (default, Dec 6 2019, 07:03:06) [MSC v.1900 64 bit (AMD64)]
0.25.1

3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)]
1.0.5

I get an AssertionError in both cases, and if I add compression='gzip' to the first to_csv call, I get RuntimeWarning: compression has no effect when passing file-like object as input.
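One way to see why the two sizes differ is to look at the first two bytes of each file, since gzip data starts with the magic bytes 1f 8b. The sketch below assumes pandas 0.24 or later, where to_csv defaults to compression='infer' and so compresses when given a path ending in .gz, while a handle from open(..., 'w') receives plain text. The file names and the starts_with_gzip_magic helper are made up for illustration.

```python
import os
import tempfile

import pandas as pd

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def starts_with_gzip_magic(path):
    """Return True if the file at `path` begins with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


df = pd.DataFrame(100 * [[123, 234, 435]])
tmp_dir = tempfile.mkdtemp()
via_handle = os.path.join(tmp_dir, "via_handle.gz")  # hypothetical names
via_path = os.path.join(tmp_dir, "via_path.gz")

# Plain text handle: pandas writes uncompressed CSV despite the .gz name.
with open(via_handle, "w", newline="") as fh:
    df.to_csv(fh)

# Path ending in .gz: compression is inferred from the extension.
df.to_csv(via_path)

handle_is_gzip = starts_with_gzip_magic(via_handle)
path_is_gzip = starts_with_gzip_magic(via_path)
```

Under those assumptions, only the file written through the path should carry the gzip magic bytes, which would explain the size mismatch in the assertion.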

Thank you

Comment From: AvivAvital2

@toninlg this worked for me

import gzip
import io

# Write the CSV to an in-memory text buffer, then gzip the encoded bytes.
with io.StringIO() as buf:
    df.to_csv(buf)
    with open('test_compressed.gz', 'wb') as remote_file:
        remote_file.write(gzip.compress(bytes(buf.getvalue(), 'utf-8')))

Comment From: miodeqqq

You can also try this:

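import csv
import gzip
from io import BytesIO, TextIOWrapper

gz_buffer = BytesIO()

with gzip.GzipFile(fileobj=gz_buffer, mode="w") as gz_file:
    # Wrap the binary gzip stream so that to_csv can write text into it.
    text_stream = TextIOWrapper(gz_file, encoding="utf8", newline="")
    df.to_csv(
        path_or_buf=text_stream,
        index=False,
        sep=",",
        quoting=csv.QUOTE_NONE,
    )
    # Flush buffered text into the gzip stream before it is closed.
    text_stream.flush()

# compression= is omitted: the gzip handle already does the compressing.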
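
For completeness, here is a sketch of the same in-memory goal on newer pandas. It assumes pandas 1.2 or later, where to_csv accepts binary file handles and applies compression= to them. It is not expected to work on the older versions mentioned earlier in this thread.

```python
import gzip
from io import BytesIO

import pandas as pd

df = pd.DataFrame(100 * [[123, 234, 435]])

# Assumption: pandas >= 1.2, which honours compression= for binary handles.
gz_buffer = BytesIO()
df.to_csv(gz_buffer, index=False, compression="gzip")

# Decompress the buffer to check that it holds the expected CSV text.
csv_text = gzip.decompress(gz_buffer.getvalue()).decode("utf-8")
```

The decompression step is only there as a sanity check; in practice gz_buffer.getvalue() would be what gets uploaded or written out.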