Flexible and Creative Chinese Poetry Generation Using Neural Memory Jiyuan Zhang NLP Group, CSLT, Tsinghua University
Outline • Introduction • The memory-augmented neural model (the MNM) • The neural model part of the MNM • The memory part of the MNM • The analysis of memory mechanism • Evaluation & Experiments • Conclusions • Future work
Introduction • Statistical or rule-based models: • Operate on the surface forms of words or characters, with no deep understanding of the meaning of a poem. • Neural models: • Learn the meaning of words or characters, and can therefore understand the meaning of a poem more deeply. • Compared to previous approaches (e.g., rule-based or statistical models), the neural approach tends to generate more fluent poems, and some generations are so natural that even professional poets cannot tell they are the work of machines.
Introduction • A problem with neural models: • Neural models are very good at learning abstract rules. • The more regular and common a pattern is, the better the neural model learns it and the more frequently it uses that pattern at run-time. • This property of neural networks leads to a lack of innovation in poem generation. • A solution to this problem: • The memory-augmented neural network that we propose can partially solve the innovation problem.
Introduction • The idea of the memory is inspired by: • A human poet, who creates poems not only by referring to common rules and patterns but also by recalling poems that he has read before. • The effect of the memory: • Balances the requirements of linguistic accordance and aesthetic innovation, leading to innovative generations that are still rule-compliant. • Provides interesting flexibility that can be used to generate poems with different styles.
The memory-augmented neural model • The function of the memory: • The memory part is not trained; it works only at prediction time. • It acts as an effective regularization that constrains and modifies the behavior of the neural model, resulting in generations with desired properties. • Ways to understand the memory-augmented neural model: • A way of combining reasoning (neural model) and knowledge (memory). • A way of combining rule-based inference (neural model) and instance-based retrieval (memory). • A way of combining predictions from complementary systems, where the neural model is continuous and parameter-shared, while the memory is discrete and contains no parameter sharing.
The neural model part of the MNM
The memory part of the MNM • The memory consists of 3 modules: • Source memory: • $m_i^{(s)} = f_d(x_{j-1}, s_{j-1}, 0)$ • Target memory: • $m_i^{(g)} = x_j$ • Weights: the memory elements are selected according to their fit to the present decoder state $s_t$; cosine distance is used to measure the degree of fit. • The output of the memory part:
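The weighting step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide's own output formula did not survive in the text, so the sketch assumes the memory output is a sum of the target-memory entries weighted by the cosine fit between the decoder state and each source-memory entry. The function names `cosine` and `read_memory` and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def read_memory(state, source_memory, target_memory):
    """Assumed read-out: weights[i] = cos(state, source_memory[i]),
    output = sum_i weights[i] * target_memory[i]."""
    weights = [cosine(state, m_s) for m_s in source_memory]
    dim = len(target_memory[0])
    output = [0.0] * dim
    for w, m_g in zip(weights, target_memory):
        for k in range(dim):
            output[k] += w * m_g[k]
    return weights, output

# Toy example: the decoder state matches the first source entry exactly,
# so only the first target entry contributes to the output.
state = [1.0, 0.0]
source_memory = [[1.0, 0.0], [0.0, 1.0]]            # stand-ins for m_i^(s)
target_memory = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # stand-ins for m_i^(g)
weights, output = read_memory(state, source_memory, target_memory)
print(weights)  # [1.0, 0.0]
print(output)   # [0.0, 1.0, 0.0]
```

Because the memory holds no trained parameters, swapping `source_memory` and `target_memory` for entries built from a different set of poems changes the read-out without retraining the neural model.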
The output of the MNM • The output of the neural model and the memory: • A learned β is not better than the manually selected one.
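One way to picture the role of β is the sketch below. The combination formula itself is missing from the slide text, so this assumes a simple additive mix: per-character scores from the memory are scaled by a manually chosen β and added to the neural model's scores before normalization. The names `softmax` and `combine` and the toy scores are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine(neural_scores, memory_scores, beta):
    """Assumed combination: neural scores plus beta times memory scores,
    normalized into a distribution over candidate characters."""
    mixed = [n + beta * m for n, m in zip(neural_scores, memory_scores)]
    return softmax(mixed)

# Toy example with two candidate characters.
neural_scores = [2.0, 0.0]   # the neural model prefers candidate 0
memory_scores = [0.0, 4.0]   # the memory prefers candidate 1
print(combine(neural_scores, memory_scores, beta=0.0))  # memory ignored
print(combine(neural_scores, memory_scores, beta=1.0))  # memory shifts the choice
```

With β = 0 the memory has no effect and the neural model's preference stands; raising β lets the memory pull the distribution toward the characters it retrieved.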
The analysis of memory mechanism • Three scenarios where adding a memory may contribute: • A one-iteration neural model, where we aim to promote innovation by adding a memory. • An over-fitted neural model, where we hope the memory can regularize the innovation. • A model whose memory is used to encourage the generation of poems of different styles.
The analysis of memory mechanism
Evaluation metrics • Five metrics to evaluate the generation: • Compliance: whether regulations on tones and rhymes are satisfied. • Fluency: whether the sentences read fluently and convey reasonable meaning. • Theme consistency: whether the entire poem adheres to a single theme. • Aesthetic innovation: whether the quatrain stimulates any aesthetic feeling with elaborate innovation. • Scenario consistency: whether the scenario remains consistent.
Evaluation process • In the innovation experiment: • Each innovation question presented the expert with two poems and asked them to judge which of the poems was better in terms of the five metrics. • In the style-transfer experiment: • Each style-transfer question presented the expert with a single poem and asked them to score it from 1 to 5, with a larger score being better, in terms of compliance, aesthetic innovation, scenario consistency, and fluency. They were also asked to specify the style of the poem.
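The pairwise judgments of the innovation experiment could be aggregated along these lines. This is only a sketch of one plausible tally, not the procedure used in the talk: it assumes each expert answer records, per metric, which of the two systems was preferred. The labels "memory" and "baseline", the function name `preference_ratio`, and the toy judgments are hypothetical and are not results from the experiments.

```python
from collections import Counter

METRICS = [
    "compliance",
    "fluency",
    "theme consistency",
    "aesthetic innovation",
    "scenario consistency",
]

def preference_ratio(judgments, system):
    """For each metric, return the fraction of judgments that preferred `system`."""
    ratios = {}
    for metric in METRICS:
        votes = Counter(j[metric] for j in judgments)
        total = sum(votes.values())
        ratios[metric] = votes[system] / total if total else 0.0
    return ratios

# Toy, made-up judgments: one answer prefers the memory-augmented system on
# every metric, the other prefers the baseline on every metric.
judgments = [
    {metric: "memory" for metric in METRICS},
    {metric: "baseline" for metric in METRICS},
]
print(preference_ratio(judgments, "memory")["fluency"])  # 0.5
```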
Experiments (innovation) • Dataset: • 500 quatrains randomly selected from our training corpus. • Two configurations: • One with a one-iteration model (C1) and the other with an overfitted model (C∞).
Experiments (innovation)
Experiments (style-transfer) • Dataset: • 300 quatrains with clear styles, including 100 pastoral, 100 battlefield, and 100 romantic quatrains. • Two configurations: • One with a one-iteration model (C1) and the other with an overfitted model (C∞).
Experiments (style-transfer)
Conclusions • The memory can encourage creative generation for regularly-trained models. • The memory can encourage rule-compliance for overfitted models. • The memory can modify the style of the generated poems in a flexible way.
Future work • Investigating a better memory selection scheme. • Investigating other regularization methods (e.g., norm or dropout) that may alleviate the over-fitting problem.
Thanks! Q&A