Is Phrase your favorite TMS? In this tutorial, you’ll learn how to import a complex JSON file in Phrase TMS with our instructor Carlos.
Carlos García Gómez
Hey everyone, and welcome to a new chapter of this video series. Today we’re going to start with a new tool that many of you have been requesting, which is Phrase TMS. So, let’s get started. We’re going to begin by creating the project in Phrase TMS, and then we’re going to have a look at the JSON file that we’re going to prepare in this video. So, here in Phrase TMS, let’s click on New Project. When creating a new project, you will see a lot of fields. I’m not going to explain all of them here; I’m just going to go straight to the point. To create a project, we basically need to specify a name, let’s say “Phrase TMS tutorial”; a source language, in this case en_us for English (United States); and a target language, in my case es_es for Spanish (Spain). That’s it, at least for the required fields. Something that I also like to do, at least for this test, is to disable the MT engine, which is enabled by default. Then there is the completed file name and export path. We’re not going to see the result in this tutorial, but we will in future Phrase TMS tutorials. By default, the exported files are named with the original file name, then a hash, then the source language, the target language, the workflow and the status. I would like to simplify this part: I’m going to keep the file name, then an underscore, then the target language, and remove all of the other fields. So if we are preparing a messages.json file, the exported file is going to be named messages_es_es.json, for Spanish of Spain. Now that it’s simplified, we can click on Create Project.
Here we can see that we have created the Phrase TMS tutorial project, and we don’t have any jobs for now. So let’s go to Notepad++ and have a look at the JSON file that we’re going to prepare. If you are not familiar with the JSON file structure, JSON stands for JavaScript Object Notation, and it’s basically a series of keys and values. In this case, we have notifications, which is a key. A key and its value are separated with a colon, and the value in this case is this whole block. As you can see, it’s indented and enclosed in curly brackets, but it doesn’t actually need to be indented, and the value of a key doesn’t need to be in curly brackets, as we’re going to see later on. The next key is enrollment, and this is a child of notifications. Again we have the key, the colon separating it from the corresponding value, and the value in curly brackets. Inside it we have another key, title, which is a child of enrollment, and this key has a corresponding value that is just a string for translation. The next key is message, with another string for translation. As you can imagine, when we prepare any file, in this case our JSON file, we first need to identify what we need to translate, that is, what we need to extract for translation.
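The key and value nesting described above can be sketched in a few lines of Python. The file content here is a hypothetical reconstruction based on what is said in the video, not the actual messages.json; the exact string values and the placeholder spellings (`{student}`, `{course_title}`) are assumptions.

```python
import json

# Hypothetical reconstruction of part of the file shown in the video.
# String values and placeholder spellings are assumptions.
SAMPLE = """
{
  "notifications": {
    "enrollment": {
      "title": "Hi, {student}!",
      "message": "You have successfully enrolled in the {course_title} course."
    }
  }
}
"""

data = json.loads(SAMPLE)

# "enrollment" is a child of "notifications"; its value is an object (a dict).
enrollment = data["notifications"]["enrollment"]

# "title" is a child of "enrollment"; its value is a plain string.
title = enrollment["title"]
print(title)

# Indentation is optional: the same data serialized on one line parses
# back to an identical structure.
compact = json.dumps(data, separators=(",", ":"))
print(json.loads(compact) == data)
```

Only the string values at the leaves (`title`, `message`) are candidates for translation; the keys themselves are structure.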
Then we need to see if there is some embedded content. In these strings, we can see the student placeholder, and we know it’s a placeholder because it’s surrounded by curly brackets: the opening curly bracket and the closing curly bracket. The same happens here for course title, which is surrounded by curly brackets, and for field. These are placeholders that we will need to protect later on in Phrase TMS, and they are not going to be included for translation, because we don’t want the linguist to modify them. Next, we have context_DNT. If you see DNT in any source file, that means “do not translate”. So even though this sentence is in English and looks like it could be translatable, depending on the corresponding key it might not need to be translated. In this case, it’s just there for context, and the client is actually asking us not to translate these strings. We’re going to see later on how we can exclude these strings from translation while keeping them as a comment for these keys and values. Next we have another key, which is meta, as in metadata. Inside we have code, with “enrollment 001”, and type, with “enrollment notification”. As you can imagine, these strings are not going to be translated, so we will need to find a way to exclude these elements from translation. The same happens down here, where we have completion and some other key-value pairs.
And the same goes for courses, where we have some other keys, for example description. It wasn’t there before: earlier we had title and message, and here we have title and description. We still have context_DNT, then instructor, and then meta. As you can see, most of the patterns are repeated in the file, and we need to know what to include and what to exclude from translation. Now that we have seen the file, we will go to Phrase TMS and create a new job. We can click on New here under the Jobs section, choose the file, and specify messages.json. Down here is the place where we are going to modify the filters, or parsers, depending on the file format that we’re dealing with. In this case we’re preparing a JSON file, so we need to find the JSON filter.
Now, here you’re going to see a series of different fields for preparing our JSON file for translation. Something really interesting is that the documentation is on support.phrase.com. If you Google, for example, “Phrase TMS JSON filter”, you’re going to find this documentation page. It’s very useful, because under Import Options you’re going to see an explanation of all of the fields that we have when preparing a JSON file for translation. But I’m going to explain all of them to you here. The first one is Parse ICU messages. ICU stands for International Components for Unicode. It covers things like pluralization support in JSON files, or date and time formats, which might differ between locales. These kinds of internationalization features can be included in a JSON file, and this kind of message format can be extracted with this Phrase TMS filter. By default it’s disabled, and in fact our JSON file doesn’t have any ICU messages, but it’s something to take into account. Below this video, I’m also going to add a couple of articles about ICU messages that you will probably find interesting to read. Next, we have Use HTML subfilter. As you can see, it’s enabled by default. This means that if we have any HTML content in the strings, let’s imagine an opening bold tag here and a closing bold tag there, or line break tags, or whatever, those tags are going to be protected by this HTML subfilter. What it’s doing in the background is using the HTML filter and applying it to the messages in the JSON file. Next, we have Convert to Phrase TMS tags. This is the place where we’re going to specify the embedded content, or the placeholders. Here we have a regexp link, so let me open it in the same window.
So, regexp in Phrase TMS: if you scroll down a little bit, you’re going to find some examples here. It’s basically about using regular expressions to lock the embedded content, or the placeholders. In this case, what we are going to do is protect the placeholders like student, course title, field, or even instructor name, which we have down here. We need to identify a pattern for these placeholders, and all of them start with an opening curly bracket and finish with a closing curly bracket. Something very important in Phrase TMS is that when using a literal curly bracket, we need to escape it. That’s the reason why I have added the backslash just before it. If we don’t escape it, it’s going to be a valid regular expression in Notepad++ and in some other tools, but not in Phrase TMS, which will say that the regular expression is invalid. So make sure that you add the backslash both for the opening curly bracket and for the closing curly bracket. Between them, we’re going to use a shorthand character class, which is backslash w. This means any letter, any digit, or an underscore, and with the plus quantifier we’re saying one or more times. We can test this in Notepad++ first. We can click on Find Next, and we will see that the placeholder has been highlighted, and the same for the next ones: course title, field. With Find All, we will see that all of them have been highlighted. If this is not enough because you have some other pattern in your file, you can add another rule here by adding a pipe, or vertical line.
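The placeholder pattern described above (escaped curly brackets around `\w+`) can be checked outside of Phrase TMS with a short Python sketch. The sample strings and placeholder spellings are assumptions, and the second pattern, which uses a pipe to add a printf-style alternative, is a hypothetical extra rule that is not needed for the file in the video. Python's `re` module is used here only for illustration; the regex engine inside Phrase TMS may differ in details.

```python
import re

# The pattern from the video: a literal "{", one or more word characters
# (letters, digits, underscore), and a literal "}".
PLACEHOLDER = re.compile(r"\{\w+\}")

# Hypothetical source strings; placeholder spellings are assumptions.
strings = [
    "Hi, {student}!",
    "You have successfully enrolled in the {course_title} course.",
    "Taught by {instructor_name}",
]

for text in strings:
    print(text, "->", PLACEHOLDER.findall(text))

# Adding a second rule with a pipe: either a {placeholder} or a
# printf-style token such as %s or %d (hypothetical extra pattern).
COMBINED = re.compile(r"\{\w+\}|%\w")
mixed = COMBINED.findall("Hello {student}, you have %d new messages")
print(mixed)
```

Each alternative around the pipe is tried at every position, so one expression can protect several placeholder styles at once.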
After the pipe, you can add whatever expression you want. It will then match either the expression on the left or the one on the right. If you want to add more, you can add more pipes and separate the regular expressions like this. In this file, this single regular expression is enough to protect the content, so let’s move on to the next fields. Here we have “Import specific keys only”, using regular expressions, or “Exclude specific keys”, also using regular expressions. This is something that we need to decide depending on the source file. For example, if most of the keys are not translatable in your source file and only one key is to be translated, then it’s better to specify only the key that you need to translate. However, if most of the keys are translatable and only one, two or three are to be excluded, then it’s better to use the Exclude specific keys field. Here we are going to use regular expressions. On the regexp page we have the general examples, but if you scroll down, you’re going to see TXT import, which we will be using in the next tutorial, and JSON import, where we have an example of a JSON file with different regular expressions to exclude or include specific keys. I’m going to explain this to you, but keep this page in mind, because it’s really useful if you want to learn more. Going back here, we can copy and paste the regex example.
This is just an example. Remember that we are under the Exclude field, so it means that we will be excluding any key named text found anywhere in the JSON file. This is a regular expression, not JSONPath rules; those are very different. That is why it’s using a period and then an asterisk, which stands for anything that comes before, and then the forward slash, which separates parents from children. For example, we have notifications, forward slash, enrollment: this is just the way that we go from a parent to a child. So with this expression, we are simply saying that we want to exclude any text key. In our case, we’re going to exclude three different keys: context_DNT, code and type. Let’s start with code, and replace it here. Now, for this file, something like this is going to work fine, because code and type are nested key-value pairs and we have said that we can have anything, any parent, before them. But imagine that you also have code or context_DNT at the top level. If we have code like this, then a colon, and then some dummy text, this code would not be excluded, because it appears at the very beginning of the file. Like notifications and courses, this code would be a parent key in the JSON file.
This way, we would only be excluding the code keys that appear inside the blocks, not the ones outside, at the very beginning. That’s why it’s useful to handle both cases. I’m going to enclose this first part in a group with round brackets and, inside it, add the caret symbol and then a pipe just before. This means that we can have either the beginning of the string, or this element found inside any other key-value pair. The key is either at the very beginning or in a nested element, and that’s why it’s useful to add the caret symbol. Next, so far we’re only excluding code, but we also need to exclude type and context_DNT. What we can do is use another group here and, separated with pipes, list code, context_DNT and type, the three keys that we need to exclude from translation. We are enclosing them in round brackets and using pipes because we want to exclude either code, or context_DNT, or type. That’s all for Exclude specific keys: we are just excluding these keys from the file. Now, for the context note, let me copy this part. This is not a regular expression; these are JSONPath rules. It means that anywhere in the JSON where we have this key (in the sample it’s comment, but we’re going to use context_DNT), its value is going to be used as a reference, a comment, or a note, as it’s called here in Phrase TMS. So we are saying that context_DNT is not going to be translated, and wherever we find context_DNT, it’s going to be used as the context.
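The exclude-keys expression built above can be tried out with a small Python sketch. It assumes, based on the video, that each key is addressed by a slash-separated path such as `notifications/enrollment/meta/code` and that the expression has to match the whole path; how Phrase TMS applies the expression internally is not confirmed here. The `key_paths` helper and the sample data are hypothetical, and the helper only handles nested objects, not arrays.

```python
import re

# Original sample-style rule: only matches "code" when it has a parent.
NESTED_ONLY = re.compile(r".*/code")

# Rule built in the video: the key is either at the very beginning (^)
# or nested under any parent (.*/), and is one of the three excluded keys.
EXCLUDE = re.compile(r"(^|.*/)(code|context_DNT|type)")


def key_paths(node, prefix=""):
    """Yield a slash-separated path for every leaf value (hypothetical helper)."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}" if prefix else key
            yield from key_paths(value, path)
    else:
        yield prefix


# Hypothetical data, including a top-level "code" key with dummy text.
data = {
    "code": "dummy text",
    "notifications": {
        "enrollment": {
            "title": "Hi, {student}!",
            "context_DNT": "Notification sent to students when enrolling in a course",
            "meta": {"code": "enrollment 001", "type": "enrollment notification"},
        }
    },
}

paths = list(key_paths(data))

# The nested-only rule misses the top-level "code" key.
missed = [p for p in paths if p.endswith("code") and not NESTED_ONLY.fullmatch(p)]
print(missed)

# The final rule excludes the three keys at any depth.
kept = [p for p in paths if not EXCLUDE.fullmatch(p)]
print(kept)
```

Without the `^|` alternative, the top-level `code` key would slip through and be sent for translation, which is the case the caret is there to cover.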
So for example, if we have context_DNT here, this string is not going to be included for translation, but it’s going to be used as a note, as context, for any key-value pair at the same level. When the linguist is translating these strings, they will see the corresponding context, and they will know that they are translating a notification sent to students when enrolling in a course. The same happens down here: when the linguist is translating this title and this description, they will have this specific context, or note, for reference, because context_DNT is the very same key that we’re adding here. Next, we have the maximum target length. This is something that you could have in your source file, but it’s not really frequent in JSON. Let’s imagine that down here we have something like max length with a value of 25, just as an example. If we add the same kind of expression, we would be saying that only 25 characters can appear in this string and in this string, because it affects all of the strings at that level. It’s not going to be frequent, but if you need it, you can do it with this functionality of the JSON filter. And lastly, the context key. This is something that you can customize, but I wouldn’t recommend it in JSON, because the key is going to be extracted automatically by the filter. When the linguist is translating “Hi, student”, they are going to see the context, which is this one, and they’re also going to see the corresponding key. The key is title, and it’s going to be used as the identifier of the string.
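The context note behavior described above, where a sibling key is attached to every string at the same level instead of being translated, can be illustrated with a rough Python sketch. This only imitates the outcome; it does not use Phrase TMS or its JSONPath syntax, and the `extract_segments` function, the sample data and the slash-separated key format are assumptions.

```python
def extract_segments(node, note_key="context_DNT", path=""):
    """Return (key_path, text, note) for each translatable string.

    Hypothetical helper: the value of `note_key` is not extracted as a
    segment; it is attached as a note to its sibling strings instead.
    """
    segments = []
    note = node.get(note_key)  # None if this level has no context key
    for key, value in node.items():
        child_path = f"{path}/{key}" if path else key
        if isinstance(value, dict):
            segments.extend(extract_segments(value, note_key, child_path))
        elif isinstance(value, str) and key != note_key:
            segments.append((child_path, value, note))
    return segments


# Hypothetical data modeled on the file described in the video.
data = {
    "notifications": {
        "enrollment": {
            "title": "Hi, {student}!",
            "message": "You have successfully enrolled in the {course_title} course.",
            "context_DNT": "Notification sent to students when enrolling in a course",
        }
    }
}

segments = extract_segments(data)
for key_path, text, note in segments:
    print(key_path, "|", text, "|", note)
```

Both `title` and `message` come out with the same note, while `context_DNT` itself never appears as a segment.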
The same key extraction happens for the message confirming that you have successfully enrolled in the course, with its course title placeholder: when the linguist is translating this string, they’re automatically going to see the corresponding key, because this is the key associated with this value.
So we can leave this empty, and the filter is going to handle that for us. That should be all. In Convert to Phrase TMS tags, we have these placeholders. We are going to exclude the specific keys, and make sure that you also add the forward slash inside the group here, because otherwise it’s not going to work as we want. It’s going to exclude any key that appears at the very beginning or inside a nested key-value pair, and only for code, context_DNT and type. And the context note is going to be context_DNT. We have uploaded the file, messages.json, so we can click on Create and see the result. Here we have our first job, with messages.json. If we click on this file, we’re going to open the Phrase TMS editor, and hopefully everything is okay. Here we have all of the translation units that we have extracted, and it looks pretty nice. The segmentation looks correct. As you can see, we can’t find strings like “enrollment 001” or “enrollment notification” here, because we have correctly excluded them from translation. Something interesting is that you can easily inspect the tags: if you hover your mouse over them, you will see student here, then course title, then field, and so on. Something really useful with the preview is that if you click on a translation unit, the Preview tab will go straight to the point where we have the corresponding string, and if you click on a string in the preview, it will go straight to the translation unit that you need to translate. In the preview, in black, you’re going to see the text that is not going to be included for translation, for example this context_DNT, or even “LPM 001” and “localization product manager”. That’s in black, so it’s not going to be included for translation.
In fact, you can’t see that text in the translation units. In grey, you’re going to see the strings that are included for translation. So this is a kind of “what you see is what you get” for a text file, in this case a JSON file.
We have another tab, which is the context note, so let me explain it to you. Let’s click on the first translation unit, where we have “Hi” and then the student placeholder. Here we have two different things. The first is the key, which, as I told you before, is extracted automatically by the filter. It says notifications, enrollment, title, because “Hi, student”, which is the first thing that we’re going to translate, is found under notifications, enrollment, title. This parent-and-child path is the way of identifying the key. For example, “Welcome to Localization Academy” should also have a key that ends in title, and in fact, if we click on this second translation unit, we’re going to see that. If we click on the next one, we’re going to see that it changes to notifications, enrollment, message, because the key for this string is now message. Lastly, you will notice that we have a note here. In this case, it says “Notification sent to a student when enrolling in a course”, and it is associated with this string. So when the linguist sees this string, they are also going to see the corresponding context, which is found under context_DNT. It is not going to be included for translation, but it’s very useful for the linguist to have it as a reference. Down here, for example, for “Go from zero to engineering in various types of projects”, we can see the corresponding key. In this case, the string is found in the second course: under courses, we have the first block, and then we have the second block.
That’s why the key says courses, then the number two in square brackets, and then description, because description is the key for this string. What’s the context? Well, the context should be “Content specific to localization engineering course”, and that’s exactly what we can see here under Note. Again, the note is going to be used as a reference for the linguist. So that’s pretty much all. We have seen how to prepare a JSON file for translation, and how to lock, or protect, the placeholders or any other kind of embedded content in JSON. We have excluded some strings from translation: we have said that code, type and context_DNT are keys that are not going to be included for translation. And lastly, we have also seen how to use a specific key as a comment, or note, for the linguist. That’s all for this first Phrase TMS tutorial. I hope you have found it useful. In the next tutorial, you’re going to learn how to create a filter for text files, and how to identify and extract the translatable text in those files.