2020年6月22日 星期一

Machine Learning Foundations NLP(2): Using the Sequencing APIs

Using the Sequencing APIs

Sequencing is use to format sentence array using token,
For example:

"I have a car"
"Car have a door"
 
Tokenize: [I:1] [have:2] [a:3] [car:4] [door:5]
 
then these two sentence can be represent to:
[1 2 3 4] 
[4 2 3 5]

Sequencing is useful to represent sentence data and take as input for a neuron network.

Code

Result:

Reference:






2020年6月17日 星期三

Machine Learning Foundations NLP(1): Tokenization for Natural Language Processing

Tokenization for Natural Language Processing


Tokenize is mean to break down a sentence to server work, for example:

I have a car. -> I / have / a/ car

Tensorflow provide a tokenize tool:Tokenizer, it can easily to use for tokenize input sentence.


Code:



Result:

{'have': 1, 'a': 2, 'i': 3, 'he': 4, 'apple': 5, 'bike': 6, 'pen': 7, 'car': 8}



Reference:

Machine Learning Foundations: Ep #8 - Tokenization for Natural Language Processing



Linux driver: How to enable dynamic debug at booting time for built-in driver.

 Dynamic debug is useful for debug driver, and can be enable by: 1. Mount debug fs #>mount -t debugfs none /sys/kernel/debug 2. Enable dy...