lpod.element

class lpod.element.odf_element(native_element, cache=None)

Bases: object

Representation of an XML element, abstracting the underlying XML library.

append(unicode_or_element)

Insert element or text in the last position.

append_named_range(named_range)

Append the named range to the spreadsheet, replacing any existing named range of the same name.

Arguments:

named_range – ODF named range
clear()

Remove text, children and attributes from the element.

clone()
del_attribute(name)
delete(child=None, keep_tail=True)

Delete the given element from the XML tree. If no element is given, “self” is deleted. The XML library may let you keep using the now-orphaned element as long as you hold a reference to it.

If keep_tail is True (the default), the tail text is not erased.

Arguments:

child – odf_element

keep_tail – boolean (defaults to True, which suits most uses)
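The effect of keep_tail can be illustrated with the standard library's xml.etree.ElementTree, which uses the same text/tail model as lxml. This is only a sketch of the idea, not lpod's implementation: the helper delete_child is hypothetical, and where exactly lpod reattaches the tail is an assumption here.

```python
import xml.etree.ElementTree as ET


def delete_child(parent, child, keep_tail=True):
    """Remove child from parent, optionally keeping the text that followed it.

    Hypothetical helper illustrating the keep_tail idea; not lpod code.
    """
    if keep_tail and child.tail:
        index = list(parent).index(child)
        if index == 0:
            # No previous sibling: the tail joins the parent's own text.
            parent.text = (parent.text or "") + child.tail
        else:
            # Otherwise it joins the tail of the previous sibling.
            previous = parent[index - 1]
            previous.tail = (previous.tail or "") + child.tail
    # ElementTree drops the child together with its tail.
    parent.remove(child)


paragraph = ET.fromstring("<p>Hello <b>big</b> world</p>")
delete_child(paragraph, paragraph[0], keep_tail=True)
text_kept = "".join(paragraph.itertext())

paragraph2 = ET.fromstring("<p>Hello <b>big</b> world</p>")
delete_child(paragraph2, paragraph2[0], keep_tail=False)
text_dropped = "".join(paragraph2.itertext())
```

With keep_tail=True the trailing " world" survives the deletion; with keep_tail=False it disappears along with the deleted element.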

delete_named_range(name)

Delete the Named Range of specified name from the spreadsheet.

Arguments:

name – str
elements_repeated_sequence(xpath_instance, name)
extend(odf_elements)

Quickly append the elements at the end of this element, using extend.

get_annotation(position=0, creator=None, start_date=None, end_date=None, content=None, name=None)

Return the annotation that matches the criteria.

Arguments:

position – int

creator – unicode

start_date – date object

end_date – date object

content – unicode regex

name – unicode

Return: odf_annotation or None if not found

get_annotation_end(position=0, name=None)

Return the annotation end that matches the criteria.

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_annotation_ends()

Return all the annotation ends.

Return: list of odf_element

get_annotations(creator=None, start_date=None, end_date=None, content=None)

Return all the annotations that match the criteria.

Arguments:

creator – unicode

start_date – date object

end_date – date object

content – unicode regex

Return: list of odf_annotation

get_attribute(name)
get_attributes()
get_between(tag1, tag2, as_text=False, clean=True, no_header=True)

Return the elements between tag1 and tag2. tag1 and tag2 must be unique and have an id attribute. (WARNING: buggy if tag1/tag2 delimit malformed ODF XML.)

If as_text is True, return the text content. If clean is True, suppress unwanted tags (deletion marks, ...). If no_header is True, existing text:h elements are changed into text:p. By default, return a list of odf_element, cleaned and without headers.

Implementation and standard restrictions: only text:h and text:p should be ‘cut’ by an insert tag, so the inner parts of insert tags are:

  • any text:h, text:p or sub tag of these
  • some text, part of a parent text:h or text:p

Arguments:

tag1 – odf_element

tag2 – odf_element

as_text – boolean

clean – boolean

no_header – boolean

Return: list of odf_paragraph or odf_header

get_bookmark(position=0, name=None)

Return the bookmark that matches the criteria.

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_bookmark_end(position=0, name=None)

Return the bookmark end that matches the criteria.

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_bookmark_ends()

Return all the bookmark ends.

Return: list of odf_element

get_bookmark_start(position=0, name=None)

Return the bookmark start that matches the criteria.

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_bookmark_starts()

Return all the bookmark starts.

Return: list of odf_element

get_bookmarks()

Return all the bookmarks.

Return: list of odf_element

get_changes_ids()

Return a list of ids that refer to a change region in the tracked changes list.

get_children()
get_dc_creator()

Get dc:creator value.

Return: unicode (or None if it does not exist)

get_dc_date()

Get the dc:date value.

Return: datetime (or None if it does not exist)

get_document_body()

Return the document body: ‘office:body’

get_draw_connector(position=0, id=None, content=None)

Return the draw connector that matches the criteria.

Arguments:

position – int

id – unicode

content – unicode regex

Return: odf_shape or None if not found

get_draw_connectors(draw_style=None, draw_text_style=None, content=None)

Return all the draw connectors that match the criteria.

Arguments:

draw_style – unicode

draw_text_style – unicode

content – unicode regex

Return: list of odf_shape

get_draw_ellipse(position=0, id=None, content=None)

Return the draw ellipse that matches the criteria.

Arguments:

position – int

id – unicode

content – unicode regex

Return: odf_shape or None if not found

get_draw_ellipses(draw_style=None, draw_text_style=None, content=None)

Return all the draw ellipses that match the criteria.

Arguments:

draw_style – unicode

draw_text_style – unicode

content – unicode regex

Return: list of odf_shape

get_draw_group(position=0, name=None, title=None, description=None, content=None)
get_draw_groups(title=None, description=None, content=None)
get_draw_line(position=0, id=None, content=None)

Return the draw line that matches the criteria.

Arguments:

position – int

id – unicode

content – unicode regex

Return: odf_shape or None if not found

get_draw_lines(draw_style=None, draw_text_style=None, content=None)

Return all the draw lines that match the criteria.

Arguments:

draw_style – unicode

draw_text_style – unicode

content – unicode regex

Return: list of odf_shape

get_draw_page(position=0, name=None, content=None)

Return the draw page that matches the criteria.

Arguments:

position – int

name – unicode

content – unicode regex

Return: odf_draw_page or None if not found

get_draw_pages(style=None, content=None)

Return all the draw pages that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_draw_page

get_draw_rectangle(position=0, id=None, content=None)

Return the draw rectangle that matches the criteria.

Arguments:

position – int

id – unicode

content – unicode regex

Return: odf_shape or None if not found

get_draw_rectangles(draw_style=None, draw_text_style=None, content=None)

Return all the draw rectangles that match the criteria.

Arguments:

draw_style – unicode

draw_text_style – unicode

content – unicode regex

Return: list of odf_shape

get_element(xpath_query)
get_elements(xpath_query)
get_formatted_text(context)

This method must return a nicely formatted version of the text.

get_frame(position=0, name=None, presentation_class=None, title=None, description=None, content=None)

Return the frame that matches the criteria.

Arguments:

position – int

title – unicode regex

description – unicode regex

content – unicode regex

Return: odf_frame or None if not found

get_frames(presentation_class=None, style=None, title=None, description=None, content=None)

Return all the frames that match the criteria.

Arguments:

style – unicode

title – unicode regex

description – unicode regex

content – unicode regex

Return: list of odf_frame

get_heading(position=0, outline_level=None, content=None)

Return the heading that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_heading or None if not found

get_headings(style=None, outline_level=None, content=None)

Return all the headings that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_heading

get_image(position=0, name=None, url=None, content=None)

Return the image that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_element or None if not found

get_images(style=None, url=None, content=None)

Return all the images that match the criteria.

Arguments:

style – str

url – unicode regex

content – unicode regex

Return: list of odf_element

get_link(position=0, name=None, title=None, url=None, content=None)

Return the link that matches the criteria.

Arguments:

position – int

name – unicode

title – unicode

url – unicode regex

content – unicode regex

Return: odf_element or None if not found

get_links(name=None, title=None, url=None, content=None)

Return all the links that match the criteria.

Arguments:

name – unicode

title – unicode

url – unicode regex

content – unicode regex

Return: list of odf_element

get_list(position=0, content=None)

Return the list that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_list or None if not found

get_lists(style=None, content=None)

Return all the lists that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_list

get_named_range(name)

Return the named range of specified name, or None if not found.

Arguments:

name – str

Return: odf_named_range

get_named_ranges()

Return all the named ranges of the tables.

Return: list of odf_named_range

get_next_sibling()
get_note(position=0, note_id=None, note_class=None, content=None)

Return the note that matches the criteria.

Arguments:

position – int

note_id – unicode

note_class – ‘footnote’ or ‘endnote’

content – unicode regex

Return: odf_note or None if not found

get_notes(note_class=None, content=None)

Return all the notes that match the criteria.

Arguments:

note_class – ‘footnote’ or ‘endnote’

content – unicode regex

Return: list of odf_note

get_office_names()

Return all the used office:name tags values of the element.

Return: list of unique str

get_orphan_draw_connectors()

Return a list of connectors which don’t have any shape connected to them.

get_outline_level()
get_paragraph(position=0, content=None)

Return the paragraph that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_paragraph or None if not found
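Most single-element getters on this page share the same pair of criteria: a content regex and a position. The following sketch shows one plausible reading of how they combine, on plain strings rather than ODF elements. The function pick is hypothetical, and applying position after the content filter is an assumption, not something this page states.

```python
import re


def pick(texts, position=0, content=None):
    """Return the item at `position` among the texts matching `content`.

    Hypothetical stand-in for the position/content criteria; returns None
    when nothing is found at that position.
    """
    if content is not None:
        pattern = re.compile(content)
        texts = [text for text in texts if pattern.search(text)]
    try:
        return texts[position]
    except IndexError:
        return None


paragraphs = ["Introduction", "First chapter", "Second chapter"]
first = pick(paragraphs)
second_chapter = pick(paragraphs, position=1, content=r"chapter$")
missing = pick(paragraphs, position=5)
```

Here position=1 selects the second paragraph among those ending in "chapter", not the second paragraph overall.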

get_paragraphs(style=None, content=None)

Return all the paragraphs that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_paragraph

get_parent()
get_prev_sibling()
get_reference_mark(position=0, name=None)

Return the reference mark that matches the criteria: either a single position reference mark (text:reference-mark) or the start of a range reference (text:reference-mark-start).

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_reference_mark_end(position=0, name=None)

Return the reference mark end that matches the criteria. Search only the tags text:reference-mark-end. Consider using: get_reference_marks()

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_reference_mark_ends()

Return all the reference mark ends. Search only the tags text:reference-mark-end. Consider using: get_reference_marks()

Return: list of odf_element

get_reference_mark_single(position=0, name=None)

Return the reference mark that matches the criteria. Search only the tags text:reference-mark. Consider using: get_reference_mark()

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_reference_mark_start(position=0, name=None)

Return the reference mark start that matches the criteria. Search only the tags text:reference-mark-start. Consider using: get_reference_mark()

Arguments:

position – int

name – unicode

Return: odf_element or None if not found

get_reference_mark_starts()

Return all the reference mark starts. Search only the tags text:reference-mark-start. Consider using: get_reference_marks()

Return: list of odf_element

get_reference_marks()

Return all the reference marks, either single position reference (text:reference-mark) or start of range reference (text:reference-mark-start).

Return: list of odf_element

get_reference_marks_single()

Return all the reference marks. Search only the tags text:reference-mark. Consider using: get_reference_marks()

Return: list of odf_element

get_references(name=None)

Return all the references (text:reference-ref). If name is provided, returns the references of that name.

Return: list of odf_element

Arguments:

name – unicode or None
get_root()
get_section(position=0, content=None)

Return the section that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_element or None if not found

get_sections(style=None, content=None)

Return all the sections that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_element

get_span(position=0, content=None)

Return the span that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_span or None if not found

get_spans(style=None, content=None)

Return all the spans that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_span

get_style(family, name_or_element=None, display_name=None)

Return the style uniquely identified by the family/name pair. If the argument is already a style object, it will return it.

If the name is not the internal name but the name you gave in the desktop application, use display_name instead.

Arguments:

family – ‘paragraph’, ‘text’, ‘graphic’, ‘table’, ‘list’, ‘number’

name_or_element – unicode or odf_style

display_name – unicode

Return: odf_style or None if not found

get_styled_elements(name=True)

Brute-force to find paragraphs, tables, etc. using the given style name (or all by default).

Arguments:

name – unicode

Return: list

get_styles(family=None)
get_svg_description()
get_svg_title()
get_table(position=0, name=None, content=None)

Return the table that matches the criteria.

Arguments:

position – int

name – unicode

content – unicode regex

Return: odf_table or None if not found

get_tables(style=None, content=None)

Return all the tables that match the criteria.

Arguments:

style – unicode

content – unicode regex

Return: list of odf_table

get_tag()

Return the tag name of the element as a qualified name, e.g. “text:span”.

Return: str

get_tail()

Return the text immediately following the element.

Inspired by lxml.

get_text(recursive=False)

Return the text content of the element.

If recursive is True, all text contents of the subtree.
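The difference between an element's own text, its tail, and the recursive text of its subtree follows the lxml model. A minimal illustration with the standard library's xml.etree.ElementTree, which uses the same model, is given below. It is a stand-in for the concepts only and does not call lpod.

```python
import xml.etree.ElementTree as ET

paragraph = ET.fromstring("<p>One <span>two</span> three</p>")
span = paragraph[0]

# Text of the element itself: what comes before its first child.
own_text = paragraph.text

# Tail of the span: the text immediately following </span>.
span_tail = span.tail

# Recursive text: every text node of the subtree, in document order.
all_text = "".join(paragraph.itertext())
```

The string " three" belongs to the span's tail rather than to the paragraph's own text, which is why the non-recursive text of the paragraph stops at "One ".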

get_text_change(position=0, idx=None)

Return the text change that matches the criteria: either a single deletion (text:change) or the start of a range of changes (text:change-start). position is the index of the element to retrieve if there are several matches (default 0); idx is the change-id of the element.

Arguments:

position – int

idx – unicode

Return: odf_element or None if not found

get_text_change_deletion(position=0, idx=None)

Return the text change of deletion kind that matches the criteria. Search only for the tags text:change. Consider using: get_text_change()

Arguments:

position – int

idx – unicode

Return: odf_element or None if not found

get_text_change_deletions()

Return all the text changes of deletion kind: the tags text:change. Consider using: get_text_changes()

Return: list of odf_element

get_text_change_end(position=0, idx=None)

Return the text change-end that matches the criteria. Search only the tags text:change-end. Consider using: get_text_change()

Arguments:

position – int

idx – unicode

Return: odf_element or None if not found

get_text_change_ends()

Return all the text change-ends. Search only the tags text:change-end. Consider using: get_text_changes()

Return: list of odf_element

get_text_change_start(position=0, idx=None)

Return the text change-start that matches the criteria. Search only the tags text:change-start. Consider using: get_text_change()

Arguments:

position – int

idx – unicode

Return: odf_element or None if not found

get_text_change_starts()

Return all the text change-starts. Search only for the tags text:change-start. Consider using: get_text_changes()

Return: list of odf_element

get_text_changes()

Return all the text changes, either single deletion (text:change) or start of range of changes (text:change-start).

Return: list of odf_element

get_text_content()

Like “get_text” but return the text of the embedded paragraph: annotations, cells...

get_toc(position=0, content=None)

Return the table of contents that matches the criteria.

Arguments:

position – int

content – unicode regex

Return: odf_toc or None if not found

get_tocs()

Return all the tables of contents.

Return: list of odf_toc

get_tracked_changes()

Return the tracked-changes part in the text body.

get_user_defined(name, position=0)

Return the user defined declaration for the given name.

Return: odf_element or None if not found

get_user_defined_list()

Return all the user defined field declarations.

Return: list of odf_element

get_user_defined_value(name, value_type=None)

Return the value of the given user defined field name.

Arguments:

name – unicode

value_type – ‘boolean’, ‘date’, ‘float’, ‘string’, ‘time’ or automatic

Return: most appropriate Python type

get_user_field_decl(name, position=0)

Return the user field declaration for the given name.

Return: odf_element or None if not found

get_user_field_decl_list()

Return all the user field declarations.

Return: list of odf_element

get_user_field_decls()

Return the container for user field declarations. Created if not found.

Return: odf_element

get_user_field_value(name, value_type=None)

Return the value of the given user field name.

Arguments:

name – unicode

value_type – ‘boolean’, ‘currency’, ‘date’, ‘float’, ‘percentage’, ‘string’, ‘time’ or automatic

Return: most appropriate Python type

get_variable_decl(name, position=0)

Return the variable declaration for the given name.

Return: odf_element or None if not found

get_variable_decl_list()

Return all the variable declarations.

Return: list of odf_element

get_variable_decls()

Return the container for variable declarations. Created if not found.

Return: odf_element

get_variable_set(name, position=-1)

Return the variable set for the given name (last one by default).

Arguments:

name – unicode

position – int

Return: odf_element or None if not found

get_variable_set_value(name, value_type=None)

Return the last value of the given variable name.

Arguments:

name – unicode

value_type – ‘boolean’, ‘currency’, ‘date’, ‘float’, ‘percentage’, ‘string’, ‘time’ or automatic

Return: most appropriate Python type

get_variable_sets(name=None)

Return all the variable sets that match the criteria.

Arguments:

name – unicode

Return: list of odf_element

index(child)

Return the position of the child in this element.

Inspired by lxml

insert(element, xmlposition=None, position=None, start=False)

Insert an element relative to this element.

Insert either using DOM vocabulary or by numeric position. If start is True, insert the element before any existing text.

Positions start at 0.

Arguments:

element – odf_element

xmlposition – FIRST_CHILD, LAST_CHILD, NEXT_SIBLING or PREV_SIBLING

start – Boolean

position – int
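To see why inserting "before any existing text" needs special handling, consider the lxml-style tree model, in which an element's leading text is stored on the element itself. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in. The helper insert_before_text is hypothetical, and reading the start argument this way is an assumption rather than lpod's documented implementation.

```python
import xml.etree.ElementTree as ET


def insert_before_text(parent, element):
    """Insert element as first child, ahead of the parent's leading text.

    Hypothetical helper: the parent's text becomes the new element's tail.
    """
    element.tail = parent.text
    parent.text = None
    parent.insert(0, element)


# Plain insertion at position 0: the leading text stays in front.
plain = ET.fromstring("<p>Hello</p>")
plain.insert(0, ET.Element("tab"))

# Insertion before the text: the text now follows the new element.
moved = ET.fromstring("<p>Hello</p>")
insert_before_text(moved, ET.Element("tab"))
```

In the first tree the text "Hello" still precedes the new child; in the second it has become the child's tail and therefore follows it.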

is_empty()

Check whether the element is empty: no text, no children, no tail.

Return: Boolean

match(pattern)

Return True if the pattern is found one or more times anywhere in the text content of the element.

Python regular expression syntax applies.

Arguments:

pattern – unicode

Return: bool

replace(pattern, new=None)

Replace the pattern with the given text, or delete it if the text is an empty string, and return the number of replacements. By default (new is None), nothing is changed and only the number of occurrences that would be replaced is returned.

It cannot replace patterns found across several elements, like a word split into two consecutive spans.

Python regular expression syntax applies.

Arguments:

pattern – unicode

new – unicode

Return: int
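The three modes described above (count only, replace, delete) and the limitation on patterns spanning several elements can be sketched on a plain list of strings, each string standing in for one text node. The function replace_in_segments is hypothetical and is not lpod's implementation.

```python
import re


def replace_in_segments(segments, pattern, new=None):
    """Count matches per segment; substitute only when `new` is given.

    Each segment stands in for one text node, so a match can never span
    two segments. Returns (count, resulting_segments).
    """
    regex = re.compile(pattern)
    count = 0
    result = []
    for segment in segments:
        count += sum(1 for _ in regex.finditer(segment))
        if new is not None:
            segment = regex.sub(new, segment)
        result.append(segment)
    return count, result


# The last two segments model the word "cat" split across two spans.
segments = ["The cat sat", " with another cat", "ca", "t"]

count_only, untouched = replace_in_segments(segments, r"cat")
replaced_count, replaced = replace_in_segments(segments, r"cat", "dog")
deleted_count, deleted = replace_in_segments(segments, r"cat", "")
```

The split word is neither counted nor replaced, which mirrors the restriction stated above.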

replace_document_body(new_body)

Change in place the full document body content.

replace_element(old_element, new_element)

Replace a sub-element in place with the element passed as second argument.

Warning: the old element is not cloned.

search(pattern)

Return the first position of the pattern in the text content of the element, or None if not found.

Python regular expression syntax applies.

Arguments:

pattern – unicode

Return: int or None
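Since both match and search look for the pattern anywhere in the text, they correspond to Python's re.search rather than re.match, which only matches at the beginning of a string. The sketch below shows that correspondence on a plain string. The helpers find_position and contains are hypothetical names, and real elements may assemble their text content differently.

```python
import re


def find_position(text, pattern):
    """Return the index of the first match, or None (the search behaviour)."""
    found = re.search(pattern, text)
    return found.start() if found else None


def contains(text, pattern):
    """Return True if the pattern occurs anywhere (the match behaviour)."""
    return re.search(pattern, text) is not None


text = "Open Document Format"
position = find_position(text, r"Doc\w+")
absent = find_position(text, r"spreadsheet")
anywhere = contains(text, r"Format$")
```

Note that re.match(r"Format$", text) would return None here, because re.match is anchored at the start of the string.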

serialize(pretty=False, with_ns=False)
set_attribute(name, value)
set_dc_creator(creator)

Set dc:creator value.

Arguments:

creator – unicode
set_dc_date(date)

Set the dc:date value.

Arguments:

date – DateTime
set_outline_level(outline_level)
set_style_attribute(name, value)

Shortcut to accept a style object as a value.

set_svg_description(description)
set_svg_title(title)
set_tag(qname)

Change the tag name of the element with the given qualified name. Return a new element as there may be a more appropriate class afterwards. XXX side effects?

Arguments:

qname – str

Return: odf_element or a subclass

set_tail(text)

Set the text immediately following the element.

Inspired by lxml.

set_text(text)

Set the text content of the element.

set_text_content(text)

Like “set_text” but set the text of the embedded paragraph: annotations, cells...

Create the paragraph if missing.

strip_elements(sub_elements)

Remove the tags of the provided elements, keeping their inner children and text.

Return: the stripped element.

Warning: the elements in the sub_elements list are not cloned.

Arguments:

sub_elements – odf_element or list of odf_element
strip_tags(strip=None, protect=None, default='text:p')

Recursively remove the tags listed in strip, keeping their inner children and text. Tags listed in protect stop the removal one level deep. If the first-level element is stripped, default is used to embed the content in the default element. If default is None and the first level is stripped, a list of text and children is returned. Return: the stripped element.

strip_tags is meant to be used by purpose-specific methods (strip_span ...). (Method name taken from lxml.)

Arguments:

strip – iterable list of unicode odf tags, or None

protect – iterable list of unicode odf tags, or None

default – unicode odf tag, or None

Return: odf_element or list
xpath(xpath_query)

Apply XPath query to the element and its subtree. Return list of odf_element or odf_text instances translated from the nodes found.

class lpod.element.odf_text(text_result)

Bases: unicode

Representation of an XML text node. Created to hide the specifics of lxml in searching text nodes using XPath.

Constructed like any unicode object but only accepts lxml text objects.

get_parent()
is_tail()
is_text()
lpod.element.odf_create_element(element_data, cache=None)
lpod.element.register_element_class(qname, cls, family=None, caching=False)

Associate a qualified element name to a Python class that handles this type of element.

Getting the right Python class when loading an existing ODF document is then transparent. Unassociated elements will be handled by the base odf_element class.

Most styles use the “style:style” qualified name and only differ by their “style:family” attribute. So the “family” attribute was added to register specialized style classes.

Arguments:

qname – str

cls – Python class

family – str
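The registration mechanism described above can be pictured as a lookup table keyed on the qualified name plus an optional family. The following is a simplified sketch of that idea, not lpod's code: the names Element, ParagraphStyle, register and class_for are all hypothetical, and the fallback order is an assumption.

```python
class Element(object):
    """Fallback class, standing in for the base element class."""


class ParagraphStyle(Element):
    """Specialised class for 'style:style' elements of family 'paragraph'."""


# Maps (qualified name, family) to the class that handles such elements.
_registry = {}


def register(qname, cls, family=None):
    """Associate a qualified name, and optionally a family, with a class."""
    _registry[(qname, family)] = cls


def class_for(qname, family=None):
    """Look up the most specific registered class for an element."""
    # Try the exact (qname, family) pair, then the qname alone.
    for key in ((qname, family), (qname, None)):
        if key in _registry:
            return _registry[key]
    # Unassociated elements fall back to the base class.
    return Element


register("style:style", ParagraphStyle, family="paragraph")

paragraph_cls = class_for("style:style", family="paragraph")
table_cls = class_for("style:style", family="table")
unknown_cls = class_for("text:unknown")
```

Keying on the family as well as the name is what lets several specialised classes share the single "style:style" qualified name.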
