There was a problem hiding this comment.
@Benjamin-T I've left some explanatory comments where the merged commits differ from those submitted with your PR.
Sorry, something went wrong.
|
|
||
| * ? Do bookmarks need to be unique across all stories? (like headers, footers, | ||
| etc.)? This could be trouble for us because we don't yet have access to | ||
| those "stories". |
There was a problem hiding this comment.
There are a couple new open questions here; this one in particular I think will be important for us to bear in mind as we go forward. I expect we should design with the prospect of Bookmarks being across all stories in the document rather than just the main document body, and I think this feature is going to be vulnerable to repair-errors for that reason.
Sorry, something went wrong.
| .. toctree:: | ||
| :titlesonly: | ||
|
|
||
| features/bookmarks |
There was a problem hiding this comment.
I generally place new analysis documents at the top of the list rather than try to find a logical ordering. This makes them easy to find in the most likely case, which is when you're actively working on the feature.
Sorry, something went wrong.
| @wip | ||
| Scenario: Document.bookmarks | ||
| Given a Document object as document | ||
| Then document.bookmarks is a Bookmarks object |
There was a problem hiding this comment.
The project's style for acceptance tests has evolved somewhat since many of these were written. This one reflects the current style, which adopts the perspective that the "user" in these user acceptance tests is a developer using the API. So the scenario name is usually terse like this, often {object}.{method or property}, sometimes with "getter" or "setter" appended when it's two aspects of a property.
The given statement in this style mentions the "variable" name used for the object in an "as" clause at the end.
This test is quite simplified from your original version. This reflects that there are three "levels" or "steps" that are tested, each individually:
A document object has a .bookmarks property that returns (unconditionally and idempotently in this case) a Bookmarks object.
A Bookmarks object has certain behaviors (len, indexed access, interable, etc.)
An individual Bookmark object has certain behaviors.
These are each tested separately and not mixed in one test.
Sorry, something went wrong.
|
|
||
| @given('a Document object as document') | ||
| def given_a_Document_object_as_document(context): | ||
| context.document = Document(test_docx('doc-default')) |
There was a problem hiding this comment.
No need for a special document file yet, and better that we don't introduce one for this because bookmarks should work even when the document has none. We'll add your test file in a later commit.
Sorry, something went wrong.
| bookmarks = context.document.bookmarks | ||
| actual = bookmarks.__class__.__name__ | ||
| expected = 'Bookmarks' | ||
| assert actual == expected, 'bookmarks is a %s object' % actual |
There was a problem hiding this comment.
While it may seem a trivial test, all we want to do here is establish we got a Bookmarks object on this call. Any particular behaviors of that bookmarks object is for another test. These "..is an X object" tests are quite effective at driving (and defending) the top-level behaviors they're designed to test.
Sorry, something went wrong.
| from docx.enum.section import WD_SECTION | ||
| from docx.enum.text import WD_BREAK | ||
| from docx.section import Section, Sections | ||
| from docx.shared import ElementProxy, Emu, lazyproperty |
There was a problem hiding this comment.
In the past I used relative imports (e.g. '.' and '..' prefixes) in this project, but have since determined this is not best practice. So whenever I have to add an import I fix the others in that module. Note they are sorted in alphabetical order at the top of the module and items within the line as well. This makes the position for a new import predictable and easy to review.
Sorry, something went wrong.
| Conflicts may arise if a bookmark is added with the same name as one | ||
| appearing in another story such as a header or footer etc. | ||
| """ | ||
| return self._body.bookmarks |
There was a problem hiding this comment.
Because of the multi-story scope of bookmarks, I'm inclined to delegate ownership of the actual Bookmarks object to _Body, like .paragraphs does with those. Eventually this Document-level bookmarks may become an "aggregate" object that contains bookmarks from all stories in the document; so this seam will allow us to add that later (and incrementally) with a minimum of disruption.
Sorry, something went wrong.
| @lazyproperty | ||
| def bookmarks(self): | ||
| """|Bookmarks| object providing access to main story bookmarks.""" | ||
| raise NotImplementedError |
There was a problem hiding this comment.
When needed for mocking purposes, a new method/property is added as a "stub". It nevertheless always has a complete docstring (that is formatted according to PEP 257). Its only line is raise NotImplementedError which causes the acceptance test to fail at just the right spot when it's next in line for implementation.
Sorry, something went wrong.
|
|
||
| bookmarks = document.bookmarks | ||
|
|
||
| assert bookmarks is bookmarks_ |
There was a problem hiding this comment.
In the past, I used to always have a fixture for each test. Now I only use a test fixture when there are multiple cases (i.e. fixture needs params=[]). It seems like it makes it easier to have all the fixture-making right there with the test when that's possible. This doesn't apply to fixture components though (single mocks signified by a trailing underscore ('_') in their name. Those are just used directly by the test rather that a fixture.
Sorry, something went wrong.
|
|
||
| @pytest.fixture | ||
| def _block_width_prop_(self, request): | ||
| return property_mock(request, Document, '_block_width') |
There was a problem hiding this comment.
I maintain fixture components in strict alphabetical order by name. This one was out of order, which explains this change not directly related to this test. Keeping them alphabetical speeds finding the one you're after when scanning the source, especially when code-folding and search aren't handy.
Sorry, something went wrong.
|
@scanny |
Sorry, something went wrong.
|
@Benjamin-T Hi Benjamin, I did some work on this over the weekend:
Please review this to understand the changes I made and why, then rebase your branch on my feature/bookmarks such that your remaining commits proceed from those that have been committed here (and do not also include commits already on my feature/bookmarks branch. You may prefer to do this by cherry-picking; any way that works for you is fine with me. Then I'll take another look. |
Sorry, something went wrong.
|
@scanny |
Sorry, something went wrong.
|
Dear @scanny I've rebased my branch and added some commits. I've also updated the branch with an additional acceptance test. If you could take a look and see if they can be merged that would be much appreciated. (Note that TravisCI says the branch is failing, but it really only is the python 3.4 test which is crashing)
The last commit bmk: add BaseStoryPart.iter_story_parts() is a refactor proposal: I haven't changed it in my current branch in order to make it more obvious what I mean. Curious about your thoughts on the matter. regards , Ben |
Sorry, something went wrong.
|
There are some pending commits in my feature branch. Please let me know when you've looked at them so we can move forward. thanks for time Ben |
Sorry, something went wrong.
|
Hi Steve, I somehow missed your new commits to this PR - Great stuff! I've experimented a bit with it and discovered that even though the bookmarks added do show up in the word editor, they are somewhat different. The main issue is that python-docx places them at the same level as the paragraph elements, whereas the word editor inserts the bookmark elements on a paragraph level. I've checked this for the body and header stories and assume that this holds for the other story-parts as well. The current implementation results in this xml: from docx import Document
doc = Document()
bmk = doc.start_bookmark('test')
doc.end_bookmark(bmk)
doc.save('test_bookmark.docx')
<w:document>
<w:body>
<w:bookmarkStart w:name="test" w:id="1" />
<w:bookmarkEnd w:id="1" />
</w:body>
</w:document>
Whereas the word editor creates this xml: <w:document>
<w:body>
<w:p>
<w:bookmarkStart w:name="test" w:id="0" />
<w:bookmarkEnd w:id="0" />
</w:p>
</w:body>
</w:document>
Edit: An even more peculiar thing happens when two body-level bookmarks are added: The first bookmark can be referred to whereas the latter leads to the same error message. My conclusion is that the word-editor tries to prevent these self-reference errors by raising the bookmark elements to the paragraph level. What would your suggestion be? Continue with this compliant implementation which gets updated upon save by the word editor or only add bookmarks in paragraph elements. Or am I missing a point here? My suggestion would be to only implement the bookmarks on paragraph level. Which would either require the user to first add a paragraph before being able to add a bookmark, or provide a convenience function on document level which adds a paragraph element before adding the bookmark to it. For reference: |
Sorry, something went wrong.
|
@Benjamin-T I think the safest bet is to determine what the Microsoft API for Word does and reproduce that behavior. Because details like this are not documented (publicly at least), the only good assurance I've found is to duplicate the MS API behavior, which is pretty good assurance in my experience, although sometimes a mystery why it needs to be the way it is. This is an example of why the analysis document is so important, in particular doing the thorough analysis work needed to properly create that document. I of course learned that the hard way, by reimplementing several features early on, but this is definitely a "teaching moment" opportunity :) There are a couple ways to experiment with the Microsoft API, but they all require Windows as far as I know. In general I think using VBA right from within Word is the easiest method. I used to keep a Windows virtual machine around (on my Mac) to do that sort of thing but don't have it anymore. |
Sorry, something went wrong.
|
Dear @scanny Thanks for the response, I've run a test using the Word API: From this I think we can conclude that the 'correct' place of the bookmark elements is paragraph level. Sub Mark()
ActiveDocument.Bookmarks.Add Name:="mark"
End Sub
Bookmark at start of documentResults in: <w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:p w:rsidR="000E0575" w:rsidRDefault="000E0575" w14:paraId="6AE70610" w14:textId="77777777" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml">
<w:bookmarkStart w:name="myplace" w:id="0" />
<w:bookmarkEnd w:id="0" />
</w:p>
<w:sectPr w:rsidR="000E0575">
<w:pgSz w:w="11906" w:h="16838" />
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0" />
<w:cols w:space="708" />
<w:docGrid w:linePitch="360" />
</w:sectPr>
</w:body>
Perhaps interesting, the corresponding generated code: using DocumentFormat.OpenXml.Wordprocessing;
using DocumentFormat.OpenXml;
namespace GeneratedCode
{
public class GeneratedClass
{
// Creates an Body instance and adds its children.
public Body GenerateBody()
{
Body body1 = new Body();
Paragraph paragraph1 = new Paragraph(){ RsidParagraphAddition = "000E0575", RsidRunAdditionDefault = "000E0575", ParagraphId = "6AE70610", TextId = "77777777" };
BookmarkStart bookmarkStart1 = new BookmarkStart(){ Name = "myplace", Id = "0" };
BookmarkEnd bookmarkEnd1 = new BookmarkEnd(){ Id = "0" };
paragraph1.Append(bookmarkStart1);
paragraph1.Append(bookmarkEnd1);
body1.Append(paragraph1);
return body1;
}
}
}
Bookmark inside a runI've also placed a bookmark using the same code with the curser inside a run: <w:document>
<w:body>
<w:p w:rsidR="000E0575" w:rsidRDefault="00390F75" w14:paraId="6AE70610" w14:textId="1AE1D158">
<w:r>
<w:t xml:space="preserve">This is a bookmark </w:t>
</w:r>
<w:bookmarkStart w:name="myplace" w:id="0" />
<w:bookmarkEnd w:id="0" />
<w:r>
<w:t>inside a run</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00390F75" w:rsidRDefault="00390F75" w14:paraId="6428D04B" w14:textId="77777777" />
<w:sectPr w:rsidR="00390F75">
<w:pgSz w:w="11906" w:h="16838" />
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0" />
<w:cols w:space="708" />
<w:docGrid w:linePitch="360" />
</w:sectPr>
</w:body>
</w:document>
using DocumentFormat.OpenXml.Wordprocessing;
using DocumentFormat.OpenXml;
namespace GeneratedCode
{
public class GeneratedClass
{
// Creates an Body instance and adds its children.
public Body GenerateBody()
{
Body body1 = new Body();
Paragraph paragraph1 = new Paragraph(){ RsidParagraphAddition = "000E0575", RsidRunAdditionDefault = "00390F75", ParagraphId = "6AE70610", TextId = "1AE1D158" };
Run run1 = new Run();
Text text1 = new Text(){ Space = SpaceProcessingModeValues.Preserve };
text1.Text = "This is a bookmark ";
run1.Append(text1);
BookmarkStart bookmarkStart1 = new BookmarkStart(){ Name = "myplace", Id = "0" };
BookmarkEnd bookmarkEnd1 = new BookmarkEnd(){ Id = "0" };
Run run2 = new Run();
Text text2 = new Text();
text2.Text = "inside a run";
run2.Append(text2);
paragraph1.Append(run1);
paragraph1.Append(bookmarkStart1);
paragraph1.Append(bookmarkEnd1);
paragraph1.Append(run2);
Paragraph paragraph2 = new Paragraph(){ RsidParagraphAddition = "00390F75", RsidRunAdditionDefault = "00390F75", ParagraphId = "6428D04B", TextId = "77777777" };
body1.Append(paragraph1);
body1.Append(paragraph2);
return body1;
}
}
}
Bookmark around specific text in run(And also added a field to test whether the reference is also working correctly) <w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:p w:rsidR="000E0575" w:rsidRDefault="00390F75" w14:paraId="6AE70610" w14:textId="5ABFDEE5" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml">
<w:r>
<w:t xml:space="preserve">This is a </w:t>
</w:r>
<w:bookmarkStart w:name="myplace" w:id="0" />
<w:r>
<w:t>bookmark inside</w:t>
</w:r>
<w:bookmarkEnd w:id="0" />
<w:r>
<w:t xml:space="preserve"> a run</w:t>
</w:r>
</w:p>
<w:p w:rsidR="004C3C5C" w:rsidRDefault="004C3C5C" w14:paraId="0DD6D36E" w14:textId="0579ED72" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" />
<w:p w:rsidR="004C3C5C" w:rsidRDefault="004C3C5C" w14:paraId="3196D0A8" w14:textId="17C6E3EF" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml">
<w:r>
<w:fldChar w:fldCharType="begin" />
</w:r>
<w:r>
<w:instrText xml:space="preserve"> REF myplace \h </w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<w:t>bookmark inside</w:t>
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
<w:p w:rsidR="00390F75" w:rsidRDefault="00390F75" w14:paraId="6428D04B" w14:textId="77777777" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" />
<w:sectPr w:rsidR="00390F75">
<w:pgSz w:w="11906" w:h="16838" />
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0" />
<w:cols w:space="708" />
<w:docGrid w:linePitch="360" />
</w:sectPr>
</w:body>
|
Sorry, something went wrong.
|
Have you plan to merge it soon ? |
Sorry, something went wrong.
This PR provides a place for me to comment on differences between merged commits and those submitted on PR #445.