Alignment Failed error with Given Example #36

Open
prashantserai opened this issue May 30, 2018 · 11 comments


@prashantserai

prashantserai commented May 30, 2018

Hi!

After some effort (tweaking the Makefiles for Phonetisaurus and MITLM, among other things), I managed to install Phonetisaurus, but I get the following error when trying to run the example from the README.

$ phonetisaurus-train --lexicon cmudict.formatted.dict --seq2_del
INFO:phonetisaurus-train:2018-05-30 17:28:00: Checking command configuration...
INFO:phonetisaurus-train:2018-05-30 17:28:00: Checking lexicon for reserved characters: '}', '|', '_'...
INFO:phonetisaurus-train:2018-05-30 17:28:00: Aligning lexicon...
ERROR:phonetisaurus-train:2018-05-30 17:28:00: Alignment failed. Exiting.

Any ideas what I could do?

@AdolfVonKleist
Owner

What platform was this, and what version of OpenFst? Could you also post the first few lines of the input lexicon?

Also, the Python script you ran is just a wrapper around the C++ binaries. What happens if you just run the aligner directly?

phonetisaurus-align --input=cmudict.formatted.dict \
  --ofile=cmudict.formatted.corpus --seq1_del=false

I just downloaded everything and recompiled it from scratch on my MacBook (OSX 10.12.6, OpenFst 1.6.2) and it seemed to go OK.

@prashantserai
Author

prashantserai commented May 31, 2018

The OpenFst version is 1.6.3 and the OS is RHEL Server release 6.9 (Santiago).

The command you tried does work for me too. Took 12 iterations or so.

What fails is phonetisaurus-train (with seq2_del or seq1_del):
phonetisaurus-train --lexicon cmudict.formatted.dict --seq2_del

The first few lines of my cmudict.formatted.dict are:

'bout B AW1 T
'cause K AH0 Z
'course K AO1 R S
'cuse K Y UW1 Z
'em AH0 M
'frisco F R IH1 S K OW0
'gain G EH1 N
'kay K EY1
'm AH0 M
'n AH0 N
'round R AW1 N D
's EH1 S
'til T IH1 L
'tis T IH1 Z
'twas T W AH1 Z
a AH0
a EY1
a's EY1 Z
a. EY1
a.'s EY1 Z
a.d. EY2 D IY1
a.m. EY2 EH1 M
a.s EY1 Z
aaa T R IH2 P AH0 L EY1
aaberg AA1 B ER0 G

@prashantserai
Author

@AdolfVonKleist are you sure you meant to close this?

@wael34218

@prashantserai I had the same problem when installing with OpenFST 1.6.8. After I uninstalled everything and reinstalled using OpenFST 1.6.2, it worked fine.

@AdolfVonKleist
Owner

@prashantserai sorry I did not get the notification with your setup details. @wael34218 were you also running RHEL 6.9? The only OpenFst and OS combinations currently running on TravisCI are those described in the config file:

I'll try to find some time this week to upgrade OpenFst to 1.6.8, but that will still only cover the existing OSX and Ubuntu 14.04 platform builds. @prashantserai if you can contribute a RHEL configuration addon for the TravisCI yaml that would certainly be welcomed.

AdolfVonKleist reopened this Jul 2, 2018

@wael34218

@AdolfVonKleist I am using Ubuntu 16.04

@prashantserai
Author

prashantserai commented Sep 8, 2018

FYI I did try changing from OpenFST 1.6.3 to 1.6.2 too, but the problem from the original post persisted for me.

I did a verbose log:

[serai@zirconium example]$ phonetisaurus-train --lexicon cmudict.formatted.dict --seq2_del --verbose
INFO:phonetisaurus-train:2018-09-07 23:12:56: Checking command configuration...
INFO:phonetisaurus-train:2018-09-07 23:12:56: Checking lexicon for reserved characters: '}', '|', '_'...
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: arpa_path: train/model.o8.arpa
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: corpus_path: train/model.corpus
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: dir_prefix: train
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: grow: False
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: lexicon_file: cmudict.formatted.dict
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: logger: <logging.Logger instance at 0xefa908>
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: makeJointNgramCommand: <bound method G2PModelTrainer._mitlm of <__main__.G2PModelTrainer instance at 0xefaa70>>
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: model_path: train/model.fst
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: model_prefix: model
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: ngram_order: 8
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: seq1_del: False
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: seq1_max: 2
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: seq2_del: True
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: seq2_max: 2
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: verbose: True
DEBUG:phonetisaurus-train:2018-09-07 23:12:57: phonetisaurus-align --input=cmudict.formatted.dict --ofile=train/model.corpus --seq1_del=false --seq2_del=true --seq1_max=2 --seq2_max=2 --grow=false
INFO:phonetisaurus-train:2018-09-07 23:12:57: Aligning lexicon...
FATAL: SetFlags: Bad option: --grow=false
ERROR:phonetisaurus-train:2018-09-07 23:12:57: Alignment failed. Exiting.
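
The non-verbose run in the original post only reported "Alignment failed", while the message that explains the failure (FATAL: SetFlags: Bad option: --grow=false) came from the aligner binary itself. A minimal sketch of how a wrapper could capture a child process's stderr and include it in its own error message. `run_and_report` is a hypothetical helper, not part of phonetisaurus-train, and the child process here is a stand-in that only imitates the failure from the log:

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return (exit_code, stderr_text).

    Hypothetical helper for illustration; not part of phonetisaurus-train.
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    _, err = proc.communicate()
    return proc.returncode, err.decode("utf-8", "replace").strip()


if __name__ == "__main__":
    # Stand-in child process that fails the way the aligner did in the log.
    fake_aligner = [
        sys.executable, "-c",
        "import sys; "
        "sys.stderr.write('FATAL: SetFlags: Bad option: --grow=false\\n'); "
        "sys.exit(1)",
    ]
    code, message = run_and_report(fake_aligner)
    if code != 0:
        print("Alignment failed (exit code %d): %s" % (code, message))
```

With something along these lines, the cause would probably be visible without rerunning with --verbose.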

@prashantserai
Author

prashantserai commented Sep 8, 2018

I was finally able to make the recipe in README.md work with the following changes after installation (the installation itself needed a couple of separate hacks to configure and build):

phonetisaurus-train: commented out lines 191-194, as below, to get past the runtime error:

command = [
    "phonetisaurus-align",
    "--input={0}".format (self.lexicon_file),
    "--ofile={0}".format (self.corpus_path),
    "--seq1_del={0}".format (str (self.seq1_del).lower ()),
    #"--seq2_del={0}".format (str (self.seq2_del).lower ()),  # line 191
    #"--seq1_max={0}".format (str (self.seq1_max)),  # line 192
    #"--seq2_max={0}".format (str (self.seq2_max)),  # line 193
    #"--grow={0}".format (str (self.grow).lower ())  # line 194
]
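
Note that commenting these lines out also stops the script from passing options the user asked for (here --seq2_del). A gentler variant, sketched below under assumptions, keeps each optional flag only when the installed aligner is known to accept it. The function name, the `supported` parameter, and the idea of maintaining that set by hand are all hypothetical; the flag names are the ones that appear in the debug log above.

```python
def build_align_command(lexicon_file, corpus_path, seq1_del, optional, supported):
    """Build a phonetisaurus-align argument list.

    `optional` maps flag names to values; `supported` is the set of optional
    flag names the installed binary is assumed to accept (hypothetical input).
    """
    command = [
        "phonetisaurus-align",
        "--input={0}".format(lexicon_file),
        "--ofile={0}".format(corpus_path),
        "--seq1_del={0}".format(str(seq1_del).lower()),
    ]
    for name in sorted(optional):
        if name in supported:
            value = optional[name]
            # Booleans become "true"/"false", as in the original script.
            text = str(value).lower() if isinstance(value, bool) else str(value)
            command.append("--{0}={1}".format(name, text))
    return command


if __name__ == "__main__":
    optional = {"seq2_del": True, "seq1_max": 2, "seq2_max": 2, "grow": False}
    # Assumption for the example: this build rejects only --grow.
    supported = set(["seq2_del", "seq1_max", "seq2_max"])
    print(" ".join(build_align_command(
        "cmudict.formatted.dict", "train/model.corpus", False,
        optional, supported)))
```

Only --grow was reported as a bad option in the log, so it is possible that the other three flags did not need to be removed.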

phonetisaurus-apply: changed line 320 to fix a syntax error.
Original code (whitespace, i.e. spaces, \t, \n, may not match the file exactly):

tester = G2PModelTester (
    args.model,
    **{key:val for key,val in args.__dict__.iteritems ()
       if not key in ["model","word_list"] }
)

Modified to:

    tempdict={}
    for key,val in args.__dict__.iteritems():
        if not key in ["model", "word_list"]:
            tempdict[key]=val
    tester = G2PModelTester (args.model,**tempdict)

I don't know why these issues show up only on my system, since this is Python code, which I thought should be largely platform independent. Anyway, hope this helps.

PS: The hacks used to build were:

  1. Added "-lrt" to the LIBS in the Makefile for Phonetisaurus
  2. Commented out a couple of lines in the configure.ac file for MITLM, as per this suggestion

@AdolfVonKleist
Owner

Interesting. Thanks for the update. What Python version/environment are you running, and can you confirm the OS/build? I'm surprised that the comprehension does not work. I would definitely like to sort out all the Python issues; as you say, that code should really be platform independent, but I currently do not know how to replicate these issues on my side.

@prashantserai
Author

Hi! The info is below. I guess it's an old Python version:

[serai@zirconium ~]$ python
Python 2.6.6 (r266:84292, Aug 9 2016, 06:11:56)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
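
Dict comprehensions were added in Python 2.7, so on 2.6.6 the original expression in phonetisaurus-apply is a syntax error, which matches the report above. Passing a generator expression to dict() does the same job on 2.6 and later. A minimal sketch: `filter_kwargs` is a hypothetical name, the option values are made up for illustration, and .items() is used in place of .iteritems() so that the snippet also runs on Python 3.

```python
def filter_kwargs(options, excluded):
    """Return a copy of `options` without the keys listed in `excluded`."""
    return dict(
        (key, val) for key, val in options.items()
        if key not in excluded
    )


if __name__ == "__main__":
    # Illustrative values only; in the script this would be args.__dict__.
    options = {"model": "model.fst", "word_list": "words.txt", "nbest": 2}
    print(filter_kwargs(options, ["model", "word_list"]))  # {'nbest': 2}
```

This is equivalent to the explicit loop in the workaround above, just written as a single expression.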

@solonj

solonj commented May 14, 2019

To offer another data point, I also had the same error. For me simply commenting out the 4 lines in the phonetisaurus-train "makeAlignerCommand" function, per @prashantserai above, was all I had to do to move forward.
Right now I am on Amazon Linux 2, OpenFST 1.7.2.
Python 2.7.14 (default, Jul 26 2018, 19:59:38)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux2
