Vote #78573

Done

Allow switching the encoding to UTF-8 when exporting to CSV

Added by Admin Redmine over 3 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
Issues_2
Target version:
Start date:
2022/05/09
Due date:
% Done:

0%

Estimated time:
category_id:
2
version_id:
99
issue_org_id:
26279
author_id:
332
assigned_to_id:
332
comments:
19
status_id:
5
tracker_id:
2
plus1:
0
affected_version:
closed_on:
affected_version_id:
Status-->[Closed]

Description

In the current implementation, the encoding of exported CSV is fixed for each language and users cannot change it.

This is sometimes a problem for teams that use multiple languages. Suppose that the issues in a project are written in Japanese or Chinese (a mixed-language project: some issues are written in Japanese and some in Chinese). If you want to export those issues to CSV, using UTF-8 as the CSV encoding is the only way to get a readable (not garbled) CSV file. But the encoding is actually fixed to CP932 for Japanese users (source:tags/3.3.3/config/locales/ja.yml#L164) and gb18030 for Chinese users (source:tags/3.3.3/config/locales/zh.yml#L147). No one can get a CSV file in UTF-8 without modifying the source code of Redmine.
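The garbling can be reproduced outside Redmine with plain Ruby. The following is only a minimal sketch: the sample text and the @export_as@ helper are made up for illustration and are not Redmine code. It converts mixed Japanese and Chinese text to CP932, replacing every character that CP932 cannot represent with "?".

```ruby
# Mixed Japanese / Simplified Chinese text (sample data, made up for this sketch).
text = "日本語の課題, 简体中文问题"

# Hypothetical helper: convert text to the target encoding and replace
# characters that the target encoding cannot represent with "?".
def export_as(text, encoding)
  text.encode(encoding, invalid: :replace, undef: :replace, replace: "?")
end

cp932 = export_as(text, "CP932")
utf8  = export_as(text, "UTF-8")

# Convert back to UTF-8 to inspect what survived the export.
puts cp932.encode("UTF-8") # the Japanese part survives, Chinese-only characters become "?"
puts utf8                  # unchanged
```

With UTF-8 as the target nothing is lost, which is why UTF-8 is the only safe choice for a mixed-language export.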

I think the problem can be resolved if users can override the general_csv_encoding setting in the CSV export options window, as in the following picture. The encoding in the drop-down list defaults to general_csv_encoding, and users can change it to an arbitrary encoding. We already have a similar drop-down in the CSV import feature.

!{width: 600px;}.csv-export-options@2x.png!


journals

--------------------------------------------------------------------------------
Thank you for this suggestion, I think this would be a great improvement to the CSV export function.

I can corroborate the problem: we (at "Planio":https://plan.io) have had numerous customers in recent months, for example Russian users who use Planio/Redmine in English, who get ??? in the export instead of Cyrillic characters because the export encoding for English does not support Cyrillic characters.
--------------------------------------------------------------------------------
Thanks, Felix. I want your advice.

At first I wrote that "users can change to arbitrary encoding". I thought that I could build the list of encodings from the Setting::ENCODINGS constant. But now I think it is useless to display such a long list in the CSV export options, and that displaying only two choices, UTF-8 and the value of general_csv_encoding, is enough, because most encodings are incompatible with any given language. For example, Japanese text is compatible with only a small number of encodings, such as CP932, Shift_JIS, ISO-2022-JP and EUC-JP.

What do you think about this? Is displaying two choices (UTF-8 and general_csv_encoding) enough?
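To make the two-choice proposal concrete, here is a rough sketch. The @csv_encoding_choices@ helper is hypothetical and is not taken from any patch. It offers the locale default first and UTF-8 second, and collapses to a single entry when the default is already UTF-8.

```ruby
# Hypothetical helper: the encodings to offer in the CSV export options.
# default_encoding stands for the locale's general_csv_encoding value.
def csv_encoding_choices(default_encoding)
  choices = [default_encoding, "UTF-8"]
  # Compare case-insensitively so "utf-8" and "UTF-8" count as one choice.
  choices.uniq { |name| name.upcase }
end

csv_encoding_choices("CP932") # => ["CP932", "UTF-8"]
csv_encoding_choices("UTF-8") # => ["UTF-8"]
```

A one-element result means there is nothing to choose, so the drop-down could simply be hidden in that case.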
--------------------------------------------------------------------------------
I am working on this issue now. I will post a patch soon.
--------------------------------------------------------------------------------
IMHO, this option doesn't need to be shown EVERY TIME.
If we could change the encoding in the account or system settings, almost all cases would probably be covered.
An example of a bad case: an environment that mixes several languages and uses Excel on Mac, which by default cannot recognize a UTF-8 CSV when opening it.
But that case would not be solved by an option in the CSV export dialog either, because the export would still need to be in UTF-8.
Are there any other bad cases if we only allow changing the default encoding?

--------------------------------------------------------------------------------
I tried writing a patch based on the specification that Go MAEDA described (#26279#note-3).
--------------------------------------------------------------------------------
select_encoding.patch failed to apply, so I fixed it.
--------------------------------------------------------------------------------
Mizuki Ishikawa, thanks for writing the patch. I tried it out and it works fine on Issues and Spent time.

With the patch, we can override general_csv_encoding with UTF-8. The "Encoding" drop-down is not displayed if general_csv_encoding for a user is UTF-8.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
This is a common issue for Redmine installations that use multiple languages from different language families, like English with Japanese or German with Russian. Since stored content might be mixed between the languages, it is generally very hard to find a common encoding for them. Forcing users to choose a Japanese locale in order to export Japanese text from their Redmine, even if they usually use English, is probably not the best user experience.

In addition to that, I understand that the current language-dependent encodings were chosen to match whatever the default encoding of Excel on Windows was for the respective language. However, this default doesn't even hold for other platforms. E.g. Excel 2016 for Mac with the en-US locale defaults to a semicolon instead of a comma as the expected separator. In any case, both the separator and the encoding can be changed while opening the file (File -> Import -> CSV file in Excel). This, however, requires that the exported CSV is saved with an encoding that can represent all of the characters in the fields.

The approach of allowing a choice between the language's default encoding and UTF-8 is in my opinion the right one. As such, I think the patch of Mizuki Ishikawa looks fine on first check, since it keeps the default encoding for the common case (as it is now) and also allows the use of UTF-8 for the more complex case. In any case, I think it's a good idea to show the user an indication of the encoding the file will be in, so that they can configure their readers accordingly.

In the patch, I'd only add a check for a valid encoding in @lib/redmine/export/csv.rb@ to avoid an exception in case an invalid encoding was specified.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Since the patch I wrote was old, I updated it so that it can be applied to the latest trunk (17255).
--------------------------------------------------------------------------------
I changed the patch to use the default encoding if an unavailable encoding is selected.
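The fallback behaviour can be sketched as follows. This is an illustration only, not the code from the attached patch: @resolve_csv_encoding@ is a hypothetical helper, and it relies on Ruby's @Encoding.find@ raising @ArgumentError@ for unknown names.

```ruby
# Hypothetical helper: return the requested encoding name if Ruby knows it,
# otherwise fall back to the default (e.g. the general_csv_encoding value).
def resolve_csv_encoding(requested, default_encoding)
  name = requested.to_s.strip
  return default_encoding if name.empty?

  # Encoding.find raises ArgumentError for an unknown encoding name.
  Encoding.find(name) ? name : default_encoding
rescue ArgumentError
  default_encoding
end

resolve_csv_encoding("UTF-8", "CP932")            # => "UTF-8"
resolve_csv_encoding("NO-SUCH-ENCODING", "CP932") # => "CP932"
resolve_csv_encoding(nil, "CP932")                # => "CP932"
```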
--------------------------------------------------------------------------------
Slightly updated tests.

* Made test names simpler and clearer.
* Replaced the encoding for the ':encoding' option with ISO-8859-3, an encoding that is not used for general_encoding_name in any config/locales/*.yml file.
--------------------------------------------------------------------------------
Committed. Thank you for improving Redmine.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------


related_issues

relates,Closed,27213,Euro symbol (€) is replaced by ? when performing a CSV export
relates,Closed,17902,CSV encoding should be UTF-8 in French locale
relates,Closed,27975,CSV export of different language
relates,Closed,31511,CSV export of time entries report does not honor project filter
relates,Closed,32641, "general_csv_encoding:" should be "UTF-8" in en.yml.

Updated by Admin Redmine over 3 years ago

  • Category set to Issues_2
  • Target version set to 4.0.0_99
