Add hi_en Code Switched #415
RajanPutty wants to merge 2 commits into NVIDIA:staging/hi_en_itn_codeswitched from
Signed-off-by: RajanPutty <rputty@nvidia.com>
for more information, see https://pre-commit.ci
@@ -0,0 +1,416 @@
10K ten k
do we need the data files here or can we just refer to the ones in the original languages? is there a difference between the two?
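One way to answer the question above is to diff the copied data file against the one in the original language directory. The following is a minimal sketch, not part of this PR: the helper names `read_tsv_rows` and `diff_tsv` are hypothetical, and it assumes the data files are plain UTF-8 TSV.

```python
from pathlib import Path


def read_tsv_rows(path):
    """Return the non-empty rows of a TSV file as a set of field tuples."""
    rows = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            rows.add(tuple(line.split("\t")))
    return rows


def diff_tsv(copied, original):
    """Return (rows only in `copied`, rows only in `original`), each sorted."""
    copied_rows = read_tsv_rows(copied)
    original_rows = read_tsv_rows(original)
    return sorted(copied_rows - original_rows), sorted(original_rows - copied_rows)
```

If both returned lists are empty, the copied file adds nothing over the original and could probably be replaced by a reference to it.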
@@ -0,0 +1,7 @@
१/४ पाव
same question as for en_whitelist
    from nemo_text_processing.inverse_text_normalization.he.verbalizers.verbalize_final import (
        VerbalizeFinalFst,
    )
elif lang == 'ko':  # Korean
let's rebase so we don't delete existing languages
parser.add_argument(
    "--lang",
    help="language",
    choices=["ar", "de", "en", "es", "es_en", "fr", "hi", "hy", "ko", "mr", "pt", "ru", "sv", "vi", "zh", 'ja'],
this should also get resolved with rebasing
'hy',
'mr',
'ja',
'ko',
this should also get resolved with rebasing
@@ -0,0 +1,30 @@
दिल्ली एक एक शून्य शून्य शून्य एक~दिल्ली ११०००१
let's add English address test cases too
@@ -0,0 +1,39 @@
आठ बटा तीन~८/३
let's add English test cases too
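A quick check for which test case files still lack English inputs could look like the sketch below. It assumes the `spoken~written` line format seen in the snippets above; `has_english_cases` is a hypothetical helper, not an existing utility in the repository, and it only looks for Latin letters on the spoken side.

```python
import re

# Any ASCII letter on the spoken side is treated as a sign of an English case.
LATIN = re.compile(r"[A-Za-z]")


def has_english_cases(lines):
    """Return True if any `spoken~written` line has Latin letters in `spoken`."""
    for line in lines:
        if "~" not in line:
            continue  # skip blank or malformed lines
        spoken, _written = line.rstrip("\n").split("~", 1)
        if LATIN.search(spoken):
            return True
    return False
```

Running this over each new test case file (for example, by passing an open file handle) would list the ones that contain only Hindi inputs and still need English cases.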
'mr',
'ja',
'rw',
'ko',
this should also get resolved with rebasing
        ClassifyFst as TNClassifyFst,
    )
    from nemo_text_processing.text_normalization.rw.verbalizers.verbalize import VerbalizeFst as TNVerbalizeFst
elif args.language == 'ko':
this should also get resolved with rebasing
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Before your PR is "Ready for review"
Pre checks:
git commit -s to sign.
pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
pytest and Sparrowhawk here.
__init__.py for every folder and subfolder, including data folder which has .TSV files?
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
Copyright 2015 and onwards Google, Inc.. See an example here.
try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.